Abstract

The answer-set programming (ASP) paradigm is a way of using logic to solve search problems.
Given a search problem, to solve it one designs a theory in the logic so that models of
this theory represent problem solutions. To compute a solution to a problem one needs to
compute a model of the corresponding theory. Several answer-set programming formalisms
have been developed on the basis of logic programming with the semantics of stable models.
In this paper we show that the logic of predicate calculus also gives rise to effective
implementations of the ASP paradigm, similar in spirit to logic programming with stable 
model semantics and with a similar scope of applicability. Specifically, we propose two 
logics based on predicate calculus as formalisms for encoding search problems. We show 
that the expressive power of these logics is given by the class NP-search. We demonstrate 
how to use them in programming and develop computational tools for model finding. In 
the case of one of the logics our techniques reduce the problem to that of propositional 
satisfiability and allow one to use off-the-shelf satisfiability solvers. The language of the 
other logic has more complex syntax and provides explicit means to model some high-level 
constraints. For theories in this logic, we designed our own solver that takes advantage 
of the expanded syntax. We present experimental results demonstrating computational 
effectiveness of the overall approach. 


1 Introduction 


Logic is most commonly used in declarative programming and computational knowledge
representation as follows. To solve a problem, we represent its general constraints and relevant
background knowledge as a theory. We express a specific instance of the problem as a formula.
We then use proof techniques to decide whether this formula follows from the theory.
A proof of the formula or, more precisely, the variable substitutions used by the proof, determine
a solution which, in most cases, is represented by a ground term. This use of logic in programming
and computing stems from the pioneering work by Robinson [Rob65], Green [Gre69] and
Kowalski [Kow74]. It led to the establishment of logic programming as, arguably, the most
prominent and most broadly accepted logic-based declarative programming formalism, and to
the development of Prolog as its implementation by Colmerauer and his group [CKPR73].

Parts of this paper appeared in Proceedings of AAAI-2000 [ET00] and in Proceedings of KI-2001
[ET01b].

Recently, researchers proposed an alternative way in which logic can be used in computation
[MT99, Nie99]. Answer-set programming (ASP) is an approach to declarative programming in
which one represents a computational problem as a theory in some logic so that models of this 
theory, and not proofs or variable substitutions, represent problem solutions. In ASP, finding 
models rather than proofs is a primary computational task and serves as a foundation for a 
uniform processing mechanism. 

ASP was first explicitly identified as a declarative modeling and computational paradigm 
in knowledge representation applications of normal logic programs (programs with negation 
in the bodies) [MT99, Nie99]. That work suggested that in order to model a problem, a
programmer should write a normal logic program so that some distinguished models of the 
program represent solutions. The most broadly accepted 2-valued semantics of normal logic 
programs is the semantics of stable models [GL88]. Thus, under that proposal, a normal logic
program solves a problem if answers to the problem are represented by stable models of the 
program and can be recovered from them. We refer to this variant of logic programming as 
stable logic programming (SLP). It is clear that SLP is a specific instance of the ASP paradigm 
described earlier. 

In general, stable models are infinite and, unless one can devise for them some finitary
representation schema, they cannot be computed. To overcome this difficulty, it is common
to restrict attention in SLP to finite DATALOG$^\neg$ programs, that is, finite programs without
function symbols. A stable model of a logic program is, by definition, an Herbrand model.
Thus, if a program is a finite DATALOG$^\neg$ program, its stable models are finite. Moreover,
there are algorithms to compute stable models of finite DATALOG$^\neg$ programs, as well as fast
implementations of these algorithms, including smodels [NS00], dlv [ELM+98], cmodels [BL02]
and assat [LZ02]. These implementations compute stable models of DATALOG$^\neg$ programs in
two steps. First, an input program is grounded, that is, replaced by a program consisting of
ground clauses only (hence, it is a propositional program) that has the same stable models
as the original one. Second, stable models of the ground program are computed by means
of search algorithms similar to the Davis-Putnam algorithm for satisfiability (SAT) testing
[NS00, ELM+98] or, after some additional preprocessing, by SAT solvers [BL02, LZ02].

SLP is a declarative programming formalism particularly well suited for representing and
solving search problems in the class NP-search, as it provides a uniform solution to each search
problem in that class [MR01]. Specifically, for every search problem $\Pi$ in the class NP-search
there is a finite DATALOG$^\neg$ program $P_\Pi$ and an encoding schema that represents every
instance $I$ of $\Pi$ as a finite collection $data(I)$ of ground atoms from the Herbrand base of $P_\Pi$
such that stable models of the program $P_\Pi \cup data(I)$ specify all solutions to the instance $I$
of the problem $\Pi$. Problems such as scheduling, planning, diagnosis, abductive reasoning,
product configuration, versions of bounded model checking, as well as a broad spectrum of
combinatorial problems, are members of the class NP-search and so admit a uniform solution
in SLP. A recent research monograph [Bar03] presents SLP solutions for several of these
problems, especially those that appear in knowledge representation. The combination of uniform
encoding, high expressive power and fast algorithms for computing stable models makes SLP
an attractive and effective declarative programming formalism. Extensions of the language
of DATALOG$^\neg$ with explicit representations for constraints involving cardinalities and other
aggregates, a corresponding generalization of the notion of a stable model, and modifications in
algorithms to compute (generalized) stable models resulted in even more effective programming
and computing systems [NS00, SNS02].

Note: in fact, dlv implements an algorithm to compute stable models of disjunctive logic programs,
a more general task.

While the notion of the answer-set programming paradigm first appeared explicitly
in the context of SLP, it is clear that the way in which propositional SAT solvers are used
fits the ASP paradigm well. For instance, in satisfiability planning [KMS96], problems
are encoded as propositional theories so that models determine valid plans. SAT solvers are
then used to compute them. Other similar uses of SAT solvers abound. Our goal in this
paper is to extend this general idea and show that predicate logic with the Herbrand-model
semantics, together with SAT solvers and their extensions as processing engines, gives rise to
effective implementations of the ASP paradigm similar in spirit to SLP and with a similar
scope of applicability. A specific logic we propose to this end is a modification of the logic of
propositional schemata [KMS96]. The key concept is that of a data-program pair $(D, P)$ to
represent separately a search problem $\Pi$ by the program component $P$ of $(D, P)$, and concrete
instances of $\Pi$ by the data part $D$ of $(D, P)$. To define the semantics of data-program pairs,
we restrict the class of Herbrand models of the theory $D \cup P$ to those Herbrand models that
satisfy a version of Reiter's Closed-World Assumption. We refer to our logic as the logic of
propositional schemata with Closed-World Assumption or, simply, as the logic of propositional
schemata. We denote this logic by PS.

The logic PS offers only basic logical connectives to help model problem constraints. We
extend logic PS to support direct representation of constraints involving cardinalities.
Examples of such constraints are: "at least k elements from the list must be in the model" or
"exactly k elements from the list must be in the model". They appear commonly in statements
of constraint satisfaction problems. We also extend the language of the logic PS with Horn
rules and use them as means to compute consequences of collections of ground facts (in
particular, to compute transitive closures of binary relations). We refer to this new logic as the
extended logic of propositional schemata (with Closed-World Assumption) and denote it by PS+.

In the paper we study basic properties of the logic PS and observe that they extend to
the logic PS+, as well. We show that the logic PS is nonmonotonic, and identify the sources of
nonmonotonicity and its implications. We demonstrate the use of the logic PS as a representation
language by developing programs for several search problems. We characterize the class of
problems that can be solved by programs in the logic PS. To this end, we define a formal
setting for the study of the expressive power of ASP formalisms. We establish that the
expressive power of the logic PS is equal to the class NP-search. In particular, it is the same as the
expressive power of SLP.

As we pointed out, in the logic PS, to solve a problem for a particular instance we represent 
the problem and the instance by a data-program pair so that Herbrand models of the data- 
program pair correspond to problem solutions. Consequently, the basic computational task 
is that of computing Herbrand models. It can be accomplished in a similar two-step process 
to that used in computing stable models. Given a finite data-program pair, we first ground 
it (compute its equivalent propositional representation) and then find models of the ground 
theory obtained. 

For grounding, we implemented a program, psgrnd, that, given a data-program pair, produces
an equivalent (with the same models) propositional theory. If the input data-program pair is
in the language of the basic logic PS, one of the options of psgrnd creates a theory in the
DIMACS format and allows one to use "off-the-shelf" SAT solvers for solving. In this way, our
logic PS and our program psgrnd provide a programming front-end for SAT solvers, greatly
facilitating their use.
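
To make the DIMACS step concrete, the following Python sketch translates ground rules into DIMACS CNF text. It is only an illustration under our own assumptions and not the psgrnd implementation: ground atoms are plain strings, a rule is a hypothetical (body, head) pair of atom lists, and the function name to_dimacs is invented for this example. The only fixed convention used is the DIMACS CNF layout itself: a header line "p cnf <variables> <clauses>" followed by one clause per line, written as signed integers and terminated by 0.

```python
def to_dimacs(ground_rules):
    """Translate ground rules (body, head) into DIMACS CNF text.

    A rule b1 & ... & bm -> h1 | ... | hn corresponds to the clause
    -b1 ... -bm h1 ... hn. Atoms are numbered in order of first occurrence.
    Returns the DIMACS text and the atom-to-number table.
    """
    index = {}
    lines = []
    for body, head in ground_rules:
        literals = []
        for atom in body:
            literals.append(-index.setdefault(atom, len(index) + 1))
        for atom in head:
            literals.append(index.setdefault(atom, len(index) + 1))
        lines.append(" ".join(str(lit) for lit in literals + [0]))
    header = f"p cnf {len(index)} {len(lines)}"
    return "\n".join([header] + lines), index


# Hypothetical ground theory: p(a) -> q(a,a) | q(a,b);  T -> p(a);  q(a,a) -> F
example_rules = [
    (["p(a)"], ["q(a,a)", "q(a,b)"]),
    ([], ["p(a)"]),
    (["q(a,a)"], []),
]
dimacs_text, atom_numbers = to_dimacs(example_rules)
```

The returned table (atom_numbers here) is what allows a model reported by a SAT solver, given as a list of signed integers, to be mapped back to ground atoms.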

If a data-program pair contains higher-level constructs proper to the logic PS+, one still can 
use propositional solvers for processing by first compiling cardinality and closure constraints 
to propositional logic and then using SAT solvers to compute models. However, propositional 
representations of constraints involving cardinalities or closure operators are usually very large 
and the sizes of the compiled theories limit the effectiveness of satisfiability checkers, even the 
most advanced ones, as processing engines. Thus, we argue for an alternative approach to 
design solvers specifically tailored to the syntax of the logic PS+. To this end, we propose a 
“target” propositional logic for the logic PS+. In this logic, cardinality and closure constraints 
have explicit representations and, therefore, do not need to be compiled any further. We 
develop a satisfiability solver, aspps, for the propositional logic PS+ and use it as the processing 
back-end for the logic PS+ and psgrnd. Our solver is designed along the same lines as most 
satisfiability solvers implementing the Davis-Putnam algorithm, but it takes direct advantage
of the cardinality and closure constraints explicitly present in the language. 

Experimental results on the performance of the overall approach are highly encouraging. 
On one hand, we demonstrate the ease and effectiveness of using off-the-shelf SAT solvers to 
attack search problems. On the other hand, we show that significant gains in performance can 
be obtained by developing more general solvers, such as aspps, capable of directly processing 
some classes of more complex constraints than those that can be expressed as clauses. In 
particular, our solver aspps is competitive with current SLP solvers such as smodels and with 
complete SAT solvers such as zchaff and satz. In fact, in several instances we considered, it was 
faster. Our work demonstrates that building propositional solvers capable of processing high- 
level constraints is a promising research direction for the area of propositional satisfiability. 

Several interrelated factors motivate us in this work. First, we want to provide an effective 
programming front-end that would capitalize on dramatic improvements in the performance of 
satisfiability solvers and would facilitate their use as computational tools. In recent years,
researchers have developed several fast implementations of the basic Davis-Putnam method such
as satz [LA97], relsat [BS97] and, most recently, chaff [MMZ+01a, MMZ+01b]. A renewed
interest in local-search techniques resulted in highly effective (albeit incomplete) satisfiability
checkers such as WALKSAT [SKC94], capable of handling large CNF theories consisting
of millions of clauses. These advances make computing Herbrand models of predicate-logic
theories feasible. Second, the use of propositional semantics also makes it easy to expand 
the basic language with constructs to explicitly represent high-level constraints and to exploit 
ideas developed in the area of propositional satisfiability to design algorithms for computing 
models of ground theories in expanded languages. Third, the semantics of Herbrand models 
of predicate theories, being essentially the semantics of propositional logic, is broadly known 
and much less complex than the semantics of stable models of logic programs. Consequently, 
the task of programming may in many cases be simpler and the overall approach may gain 
broader acceptance. 
2 Basic logic PS


In this section, we introduce the logic PS that provides a theoretical basis for a declarative 
programming front-end to satisfiability solvers and facilitates their use. Syntactically, the logic 
PS is a fragment of first-order logic without function symbols (or, in other words, predicate 
logic). Specifically, the language of the logic PS consists of: 

1. infinite denumerable sets $R$, $C$ and $V$ of relation, constant and variable symbols

2. symbols $\bot$ and $\top$ (later interpreted always as falsity and truth)

3. boolean connectives $\wedge$, $\vee$ and $\rightarrow$, the universal and existential quantifiers, and
punctuation symbols '(', ')' and ','.

In the paper, following the example of logic programming, we adopt the convention that
upper-case letters denote variables and lower-case letters stand for constants.

Constant and variable symbols are the only terms of the language. Constants are the only
ground terms and they form the Herbrand universe of the language. Atoms are expressions of
the form $p(t_1, \ldots, t_n)$, where $p$ is an $n$-ary relation symbol from $R$ and $t_i$, $1 \le i \le n$, are terms.
An atom $p(t_1, \ldots, t_n)$ is ground if all its terms are ground. The set of all ground atoms forms
the Herbrand base of the language.

In the logic PS, we restrict the use of existential quantifiers. Let us consider a tuple of
terms $(t_1, \ldots, t_n)$ and let $X_1, \ldots, X_k$ be pairwise distinct variables such that each $X_i$, $1 \le i \le k$,
appears in the tuple $(t_1, \ldots, t_n)$ exactly once. An expression of the form

$$\exists X_1, \ldots, X_k\; p(t_1, \ldots, t_n)$$

is an e-atom. For instance, the expression $\exists X, Z\; p(X, Y, Z, c)$ is an e-atom while the expression
$\exists X, Z\; p(X, X, Z, c)$ is not. In the logic PS, existential quantifiers appear exclusively in e-atoms.

The requirement that each $X_i$, $1 \le i \le k$, appears in the tuple $(t_1, \ldots, t_n)$ exactly once is
not essential and can be lifted. We adopt it as it allows us to simplify the notation for e-atoms.
Namely, we write an e-atom

$$\exists X_1, \ldots, X_k\; p(t_1, \ldots, t_n)$$

as

$$p(t'_1, \ldots, t'_n),$$

where $t'_i = t_i$ if $t_i$ is not one of the variables $X_1, \ldots, X_k$, and $t'_i = \_$ (underscore) otherwise.
For instance, we write an e-atom $\exists X, Z\; p(X, Y, Z, c)$ as $p(\_, Y, \_, c)$.

The only formulas we allow in the logic PS are rules, that is, formulas

$$\forall X_1, \ldots, X_k\, (A_1 \wedge \ldots \wedge A_m \rightarrow B_1 \vee \ldots \vee B_n),$$

where all $A_i$, $1 \le i \le m$, and $B_j$, $1 \le j \le n$, are atoms, none of the $A_i$'s is an e-atom (in
other words, e-atoms do not appear in the antecedents of clauses) and $X_1, \ldots, X_k$ are the free
variables appearing in $A_1, \ldots, A_m$ and $B_1, \ldots, B_n$. If $m = 0$, we replace the empty conjunct in the
antecedent of the clause with the symbol $\top$. If $n = 0$, we replace the empty disjunct in the
consequent of the clause with the symbol $\bot$.
As usual, we drop the universal quantifiers from the rule notation. For instance, to denote
the rule

$$\forall X, Y, Z\, (p(X, Y) \wedge p(Y, Z) \rightarrow q(\_, X) \vee q(Y, \_) \vee r(Z)),$$

which is already a shorthand for

$$\forall X, Y, Z\, (p(X, Y) \wedge p(Y, Z) \rightarrow \exists W\, q(W, X) \vee \exists W\, q(Y, W) \vee r(Z)),$$

we write

$$p(X, Y) \wedge p(Y, Z) \rightarrow q(\_, X) \vee q(Y, \_) \vee r(Z).$$


This notation is reminiscent of that commonly used for clauses in predicate logic. There is a 
key difference, though. Some of the atoms in the consequent of a rule may be e-atoms (as it is 
the case in the example just given). Thus, unlike in the case of clauses, a rule of the logic PS 
may contain the existential quantifier in the consequent. 

The last syntactic notion we need is that of a theory. A theory in the logic PS (or, a 
PS theory) is any finite collection of rules that contains at least one occurrence of a constant 
symbol. 

To recap the discussion of the syntax of the logic PS, it is essentially a fragment of the
syntax of first-order logic with the following restrictions and caveats: (1) function symbols are
not allowed, (2) rules are the only formulas, (3) theories are finite and contain at least one
constant symbol, and (4) through the use of notational conventions, the quantifiers are only
implicitly present in the language.

The difference between the logic PS and the corresponding fragment of the first-order logic 
is in the way we interpret theories. Namely, we view a theory T as a representation of a certain 
class of models of T and not as a representation of logical consequences of T. In fact, due to 
the way we use the logic PS, the concept of provability plays virtually no role in logic PS. 
This is an essential departure from the classical first-order logic perspective. 

Specifically, we assign to a PS theory $T$ a collection of its Herbrand models. The concepts
of an Herbrand interpretation, of truth in an interpretation and of an Herbrand model, that we
use in the paper, are standard (for details, we refer to any text in logic, for instance, [NS93]).
Here we will only introduce some necessary notation. Let $T$ be a PS theory. We denote by
$HU(T)$ the Herbrand universe of $T$, that is, in our case, the set of all constants that appear in
$T$. By the definition of a PS theory, this set is nonempty and finite. We denote by $HB(T)$
the Herbrand base of $T$, that is, the collection of all ground atoms $p(c_1, \ldots, c_n)$, where $p$ is
an $n$-ary relation symbol appearing in $T$ and $c_i \in HU(T)$, $1 \le i \le n$. Following a standard
practice, we identify Herbrand interpretations of $T$ with subsets of $HB(T)$.

The restriction to Herbrand interpretations is important. In particular, it implies that the
logic PS is nonmonotonic. Indeed, if $T_1 \subseteq T_2$ are two PS theories, it is not necessarily the case
that every Herbrand model of $T_2$ is a Herbrand model of $T_1$. For example, let
$T_1 = \{\top \rightarrow p(\_),\ p(a) \rightarrow \bot\}$, and let $T_2 = T_1 \cup \{\top \rightarrow p(b)\}$. It is easy to see that
$M = \{p(b)\}$ is a Herbrand model of $T_2$ and that $T_1$ has no Herbrand models. Indeed,
$HU(T_1) = \{a\}$ and the only Herbrand interpretation satisfying the first rule is $M' = \{p(a)\}$.
This interpretation, however, does not satisfy the second rule. In contrast, classical first-order
logic is monotone: for every two collections of sentences $T_1 \subseteq T_2$, if $M$ is a model of $T_2$ then it
is a model of $T_1$, as well. In the example discussed above, the Herbrand model specified by the
subset $\{p(b)\}$ of $HB(T_2)$ is a model of $T_1$ but not a Herbrand model of $T_1$.
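
The nonmonotonicity example can be checked by brute force. The Python sketch below is an illustration under our own assumptions (a single unary predicate p, rules given as hypothetical (body, head) pairs listing the arguments of p, and "_" marking an existentially quantified position); it enumerates all subsets of the Herbrand base and keeps those that satisfy every rule.

```python
from itertools import chain, combinations


def herbrand_models(constants, rules):
    """Enumerate Herbrand models of a theory over the unary predicate p.

    `rules` is a list of (body, head) pairs. Each side lists arguments of p;
    "_" in a head stands for an existentially quantified position and is
    replaced by every constant of the Herbrand universe.
    """
    base = [("p", c) for c in constants]

    def expand(args):
        atoms = []
        for a in args:
            atoms.extend(constants if a == "_" else [a])
        return [("p", c) for c in atoms]

    ground = [(expand(body), expand(head)) for body, head in rules]
    subsets = chain.from_iterable(
        combinations(base, r) for r in range(len(base) + 1))
    models = []
    for subset in subsets:
        m = set(subset)
        # A rule holds if its body is not contained in m or some head atom is.
        if all(not set(b) <= m or any(a in m for a in h) for b, h in ground):
            models.append(m)
    return models


T1 = [([], ["_"]), (["a"], [])]   # T -> p(_),  p(a) -> F
T2 = T1 + [([], ["b"])]           # additionally T -> p(b)
models_T1 = herbrand_models(["a"], T1)
models_T2 = herbrand_models(["a", "b"], T2)
```

Adding the rule for p(b) enlarges the Herbrand universe, which is why the larger theory gains a model that the smaller one lacks.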
The restriction to Herbrand models also allows us to develop algorithms to compute them.
Namely, as in the case of stable logic programming, Herbrand models of a PS theory T can 
be computed in two steps. First, we ground T to a propositional theory that has the same 
Herbrand models as T. Next, we compute models of T by computing models of the ground 
theory. This latter task can be accomplished by off-the-shelf propositional satisfiability solvers. 

The concept of grounding is similar to that used in the context of universal theories in
first-order logic or programs in logic programming. The only difference comes from the fact
that rules in PS theories may include e-atoms in the consequents. We will now discuss the
task of grounding in detail. Let $p(t)$ be an atom that occurs in $T$. If $p(t)$ is not an e-atom,
we define $p^e(t) = p(t)$. If $p(t)$ is an e-atom (we assume that all e-atoms in $T$ are given in the
"underscore" notation), we define $p^e(t)$ as the disjunction of all atoms of the form $p(t')$, where
$t'$ is obtained from $t$ by replacing all occurrences of the underscore symbol _ in $t$ with constants
from $HU(T)$ (that is, constants that appear in $T$). For example, if $a$ and $b$ are the only two
constants in $T$ and $p(\_, X, \_, a)$ is an e-atom in $T$, we have

$$p^e(\_, X, \_, a) = p(a, X, a, a) \vee p(a, X, b, a) \vee p(b, X, a, a) \vee p(b, X, b, a).$$

Further, for a rule $r \in T$, where

$$r = A_1 \wedge \ldots \wedge A_m \rightarrow B_1 \vee \ldots \vee B_n,$$

we define

$$r^e = A_1 \wedge \ldots \wedge A_m \rightarrow B_1^e \vee \ldots \vee B_n^e.$$

We note that $T^e = \{r^e : r \in T\}$ contains no occurrences of the underscore symbol.

Let $\vartheta$ be a ground substitution, that is, a mapping whose domain is a finite subset of the set
of variables from the language, and which assigns constants from the language to variables in its
domain. By an expression we mean a term, a list of terms, a formula without any occurrence of
the underscore symbol, or a set of formulas without any occurrence of the underscore symbol.
A ground substitution $\vartheta$ is defined for an expression $E$ if every variable appearing in $E$ belongs
to the domain of $\vartheta$. Let $E$ be an expression and let $\vartheta$ be a ground substitution that is defined
for $E$. By $E\vartheta$ we denote the expression obtained from $E$ by replacing all variables occurring
in $E$ with their images under $\vartheta$. We call an expression of the form $E\vartheta$ a ground instance of
$E$. For an expression $E$, by $gr(E)$ we denote the set of all ground instances of $E$. Finally, we
extend the notion of grounding to PS theories which are not, in general, expressions as they
may contain the underscore symbol. Namely, for a PS theory $T$, we set $gr(T) = gr(\{r^e : r \in T\})$.

We will illustrate the concepts we introduced with an example. Let $T$ be a PS theory that
consists of the following two clauses:

$$C_1 = q(b, c) \rightarrow p(a)$$

$$C_2 = p(X) \rightarrow q(X, \_).$$

To compute $gr(T)$ we need to compute all ground instances of $C_2$ ($C_1$ is already in the ground
form). First, we compute the formula $C_2^e$:

$$C_2^e = p(X) \rightarrow q(X, a) \vee q(X, b) \vee q(X, c).$$

To obtain all ground instances of $C_2$ (or $C_2^e$), we replace $X$ in $C_2^e$ with $a$, $b$ and $c$ and obtain
the following three clauses:

$$p(a) \rightarrow q(a, a) \vee q(a, b) \vee q(a, c)$$
$$p(b) \rightarrow q(b, a) \vee q(b, b) \vee q(b, c)$$
$$p(c) \rightarrow q(c, a) \vee q(c, b) \vee q(c, c).$$

These three clauses together with $C_1$ form $gr(T)$.
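
The grounding procedure just described can be sketched in a few lines of Python. The representation is our own assumption, chosen for illustration: an atom is a hypothetical pair (predicate, argument tuple), a rule is a (body, head) pair of atom lists, variables start with an upper-case letter, and "_" marks an existentially quantified position. The function names are likewise invented here.

```python
from itertools import product


def is_var(term):
    # Convention from the text: upper-case letters denote variables.
    return term[0].isupper()


def herbrand_universe(theory):
    """All constants occurring in the theory, in sorted order."""
    return sorted({t for body, head in theory for _, args in body + head
                   for t in args if t != "_" and not is_var(t)})


def expand_e_atom(atom, universe):
    """Replace every "_" by each constant; ordinary atoms are returned as is."""
    pred, args = atom
    slots = [universe if t == "_" else [t] for t in args]
    return [(pred, combo) for combo in product(*slots)]


def instantiate(atoms, sub):
    return [(pred, tuple(sub.get(t, t) for t in args)) for pred, args in atoms]


def ground(theory):
    """Compute the ground rules: expand e-atoms, then substitute constants."""
    universe = herbrand_universe(theory)
    result = []
    for body, head in theory:
        head_e = [a for atom in head for a in expand_e_atom(atom, universe)]
        variables = sorted({t for _, args in body + head_e
                            for t in args if is_var(t)})
        for values in product(universe, repeat=len(variables)):
            sub = dict(zip(variables, values))
            result.append((instantiate(body, sub), instantiate(head_e, sub)))
    return result


T = [
    ([("q", ("b", "c"))], [("p", ("a",))]),   # C1: q(b, c) -> p(a)
    ([("p", ("X",))], [("q", ("X", "_"))]),   # C2: p(X) -> q(X, _)
]
ground_T = ground(T)
```

For the two-clause theory of the example, ground_T contains the clause for C1 followed by the three ground instances listed above.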

The following proposition establishes the adequacy of the concept of grounding in the 
study of models of PS theories. It also demonstrates that satisfiability provers can be used to 
compute models of PS theories. The proof of the proposition is simple and reflects closely the 
corresponding argument in the first-order case. Thus, we omit it. 

Proposition 2.1 Let $T$ be a PS theory. A set $M \subseteq HB(T)$ is a Herbrand model of $T$ if and
only if $M$ is a propositional model of $gr(T)$.

3 Equality and arithmetic in the logic PS 

In the following sections, we will often consider a version of the logic PS in which some
relation symbols are given a prespecified interpretation. They are equality, inequality and basic
arithmetic relations such as $<$, $\le$, $>$, $\ge$, $+$, $*$, $-$, $/$, etc. Inclusion of these relation symbols
in the language is important as they greatly facilitate the task of programming (modeling
knowledge, representing constraints as rules). We will use standard symbols to represent
them, as well as the standard infix notation. In particular, we will write $t_1 = t_2$ rather than
$=(t_1, t_2)$, and $t = t_1 + t_2$ (or $t_1 + t_2 = t$) rather than $+(t_1, t_2, t)$. We will denote the set of
these relation symbols by $EA$.

In this section, we will define the semantics for this variant of the logic PS. The idea is to
interpret all symbols in the set $EA$ according to their intended meaning. Specifically, let $C$ be
the set of constant symbols. We define a theory $=_C$ to consist of all clauses of the form

1. $\top \rightarrow\, =(t, t)$ (we will write them as $\top \rightarrow (t = t)$), for every $t \in C$, and

2. $=(s, t) \rightarrow \bot$ (we will write them as $(s = t) \rightarrow \bot$), for all $s, t \in C$ such that $s \ne t$.

Next, we define a theory $+_C$ to consist of all clauses of the form

1. $\top \rightarrow +(t, u, s)$ (we will write them as $\top \rightarrow (s = t + u)$), for all integers $s, t, u \in C$
such that $s = t + u$, and

2. $+(t, u, s) \rightarrow \bot$ (we will write them as $(s = t + u) \rightarrow \bot$), for all $s, t, u \in C$ such that at
least one of $s, t, u$ is not an integer, or $s, t, u$ are integers and $s \ne t + u$.

In the same way we define theories $p_C$ for other relation symbols $p$ in $EA$ such as $<$, $-$, $*$, etc.
All these theories provide explicit intended definitions of the corresponding relation symbols.
We will often refer to the relation symbols in $EA$ as predefined since their interpretation is
fixed.

Let $T$ be a PS theory in the language containing distinguished relation symbols from the
set $EA$. Let $C$ be the set of constants appearing in $T$ (that is, $C = HU(T)$). A set $M$ of
ground atoms in the language is a model of $T$ if $M$ is a model of the theory $T \cup \bigcup \{p_C : p \in EA\}$
as defined above.
It is clear that, with the help of additional variables, we can express in the logic PS arbitrary
arithmetic expressions. For instance, we will write

$$q((X + Y) * Z, a) \wedge B \rightarrow H$$

and interpret this expression as

$$(T_1 = X + Y) \wedge (T_2 = T_1 * Z) \wedge q(T_2, a) \wedge B \rightarrow H.$$

Similarly, we will interpret

$$B \rightarrow q((X + Y) * Z, a) \vee H$$

as the clause

$$B \wedge (T_1 = X + Y) \wedge (T_2 = T_1 * Z) \rightarrow q(T_2, a) \vee H.$$

In each case, variables $T_1$ and $T_2$ are different from all other variables appearing in the rules.
In order to obtain uniqueness of the interpretation, when decomposing arithmetic expressions,
we follow the standard order in which they are evaluated. Let us emphasize that arithmetic
expressions are simply notational shortcuts and not elements of the language.
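
The decomposition of a nested arithmetic expression into predefined atoms can be sketched as follows. This is an illustrative Python fragment under our own assumptions: an expression is a hypothetical nested tuple (operator, left, right) with strings at the leaves, the auxiliary variables are named T1, T2, ... and are assumed not to clash with variables already used in the rule, and subexpressions are processed left to right, innermost first.

```python
from itertools import count


def flatten(expr, fresh=None, equations=None):
    """Decompose a nested arithmetic expression into auxiliary equalities.

    Returns the term that replaces the expression in the rule, together
    with the list of equalities (as strings) in evaluation order.
    """
    fresh = fresh if fresh is not None else count(1)
    equations = equations if equations is not None else []
    if isinstance(expr, str):          # a variable or a constant
        return expr, equations
    op, left, right = expr
    left_term, _ = flatten(left, fresh, equations)
    right_term, _ = flatten(right, fresh, equations)
    var = f"T{next(fresh)}"            # fresh auxiliary variable (assumed unused)
    equations.append(f"{var} = {left_term} {op} {right_term}")
    return var, equations


# The argument (X + Y) * Z of q((X + Y) * Z, a) from the example above.
term, equalities = flatten(("*", ("+", "X", "Y"), "Z"))
```

The resulting equalities are added to the antecedent of the rule, and the returned term takes the place of the original expression.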

In the remainder of the paper, we will always assume that the language contains predefined
relation symbols. Since their extensions are already fully specified, we will omit the
corresponding ground atoms when describing models of theories. 

To illustrate these concepts, let us consider the following PS theory $T$:

$$T = \{\top \rightarrow p(1),\ \top \rightarrow p(2),\ p(X) \rightarrow q(X, X + 1)\}.$$

This theory represents the following PS theory $T'$:

$$T' = \{\top \rightarrow p(1),\ \top \rightarrow p(2),\ p(X) \wedge (Y = X + 1) \rightarrow q(X, Y)\}.$$

The theory $gr(T')$ consists of the first two rules (they are already ground) and the following
four instantiations of the third rule:

$$p(1) \wedge (1 = 1 + 1) \rightarrow q(1, 1)$$
$$p(1) \wedge (2 = 1 + 1) \rightarrow q(1, 2)$$
$$p(2) \wedge (1 = 2 + 1) \rightarrow q(2, 1)$$
$$p(2) \wedge (2 = 2 + 1) \rightarrow q(2, 2).$$

Models of this theory are (by the definition) models of the theory $T$. For instance,
$\{p(1), p(2), q(1, 2)\}$ is a model of $T$. We point out that, according to our convention, we omitted
from the model description the atom $2 = 1 + 1$ (or, more formally, the atom $+(1, 1, 2)$).
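
The models in this example can be enumerated directly. The Python sketch below is a brute-force illustration under our own assumptions: atoms are hypothetical tuples such as ("p", 1) and ("q", 1, 2), and the predefined atom Y = X + 1 is evaluated by the host language rather than stored in the model, in line with the convention of omitting predefined atoms.

```python
from itertools import chain, combinations, product

universe = [1, 2]

# Ground rules as (body, head) pairs: two facts plus the instances of
# p(X) & (Y = X + 1) -> q(X, Y). An instance whose arithmetic atom is
# false holds trivially, so only instances with a true atom are kept.
rules = [([], [("p", 1)]), ([], [("p", 2)])]
for x, y in product(universe, repeat=2):
    if y == x + 1:
        rules.append(([("p", x)], [("q", x, y)]))

# Herbrand base restricted to the non-predefined predicates p and q.
base = [("p", c) for c in universe]
base += [("q", x, y) for x, y in product(universe, repeat=2)]


def satisfies(model, rule):
    body, head = rule
    return not all(a in model for a in body) or any(a in model for a in head)


candidates = chain.from_iterable(
    combinations(base, r) for r in range(len(base) + 1))
models = [set(c) for c in candidates
          if all(satisfies(set(c), r) for r in rules)]
```

Every model contains p(1), p(2) and q(1, 2); the remaining q atoms are unconstrained by the theory, so they may or may not be present.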


4 Programming with logic PS 


The logic PS, described in the previous section, can be used as a basis for a declarative
programming formalism based on the paradigm of answer-set programming [MT99]. To this
end, we need to introduce the concepts of input data and a program. We follow the approach
proposed and studied in the area of relational databases [Ull88]. A relational database can
be viewed as a collection of ground atoms of some logic language. We often use (for instance,
in the context of DATALOG and its variants) the term extensional database to refer to a
collection of ground atoms specifying a relational database. Queries are finite theories, often
of special form, in this logic language (for instance, definite Horn theories without function
symbols serve as queries in the case of DATALOG). Queries define new properties (relations) in
terms of those relations that are explicitly specified by the underlying (extensional) database.

Guided by these intuitions, we define a data-program pair to be a pair $(D, P)$, where $D$ is a
finite set of ground atoms in a language of the logic PS and $P$ is a finite collection of PS rules.
We use data-program pairs to represent specific computational problem instances. We view $D$
as an encoding of relevant input data and $P$ as a declarative specification of the computational
task in question. Accordingly, given a data-program pair $(D, P)$, we refer to $D$ as a data set
and to $P$ as a program. We use the term data predicate for all relation symbols appearing in
atoms in $D$. We use the term program predicate to refer to all relation symbols that appear
in $P$ and are neither data predicates nor predefined predicates from $EA$. Intuitively, $D$ is a
counterpart of an extensional database and $P$ is a counterpart of a database query.

We will now introduce a semantics for data-program pairs. To this end, we will encode a
data-program pair $(D, P)$ as a theory in the logic PS. Since $D$ is a set of ground atoms
representing the problem instance (input data), we assume that $D$ provides a complete specification
of the input. That is, we assume that no other ground atoms built of predicates appearing
in $D$ are true. Since the only formulas in the logic PS are rules, we encode the information
specified by $D$ as a set of rules $cl(D)$ defined as follows. For every relation symbol $p$ appearing
in $D$ and for every ground tuple $t$ (with constants from the Herbrand universe of $D \cup P$) of
appropriate arity, if $p(t) \in D$, we include in $cl(D)$ the clause

$$\top \rightarrow p(t).$$

Otherwise, if $p(t) \notin D$, we include in $cl(D)$ the clause

$$p(t) \rightarrow \bot.$$

It is clear that the set $cl(D)$ can be regarded as the result of applying Reiter's Closed-World
Assumption to $D$.
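
The construction of the closed-world rules for a data set can be sketched in Python as follows. The encoding is an assumption made for illustration: a ground atom is a hypothetical pair (predicate, argument tuple), a rule is a (body, head) pair in which an empty body plays the role of truth and an empty head the role of falsity, and the optional argument extra_constants stands for constants that occur only in the program.

```python
from itertools import product


def closure(data, extra_constants=()):
    """Return the closed-world rules for a set of ground data atoms."""
    universe = sorted({c for _, args in data for c in args}
                      | set(extra_constants))
    signatures = sorted({(pred, len(args)) for pred, args in data})
    rules = []
    for pred, arity in signatures:
        for args in product(universe, repeat=arity):
            atom = (pred, args)
            if atom in data:
                rules.append(([], [atom]))     # truth -> p(t)
            else:
                rules.append(([atom], []))     # p(t) -> falsity
    return rules


# A hypothetical data set: two vertices and one directed edge.
D = {("vtx", ("a",)), ("vtx", ("b",)), ("edge", ("a", "b"))}
cl_D = closure(D)
```

For this data set, the result fixes the truth value of all four edge atoms and both vtx atoms over the universe {a, b}, so no model can add or remove data atoms.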

We represent a data-program pair $(D, P)$ by a PS theory $cl(D) \cup P$. We say that a set $M$
of ground atoms, $M \subseteq HB(cl(D) \cup P)$, is a model of a data-program pair $(D, P)$ if it is a model
of $cl(D) \cup P$. We denote the set of all models of a data-program pair $(D, P)$ by $Mod(D, P)$.

In separating data and program predicates and in adopting the closed-world assumption for 
the treatment of data atoms we are guided by the intuition that data predicates are intended 
to represent input data. Their extensions should not be affected by the computation. The 
effects of the computation should be reflected in the extensions of program predicates only. 

We designed the logic PS and introduced the concept of a data-program pair to model 
computational problems. To illustrate this use of our formalism, we show how to encode 
several well-known search problems by means of data-program pairs. We assume that the 
language contains predefined relation symbols to represent equality and arithmetic relations. 

We start with the graph k-colorability problem: given an undirected graph and a set of k
colors, the objective is to find an assignment of colors to vertices so that no two identically 
colored vertices are joined with an edge (or to determine that no such coloring exists). 

We set 

D_gcl(G, k) = {vtx(v): v ∈ V} ∪ {edge(v, w): {v, w} ∈ E} ∪ {color(i): 1 ≤ i ≤ k}.

The set of atoms D_gcl(G, k) represents an instance of the coloring problem. The predicates vtx,
edge and color are data predicates. Their extensions define vertices and edges of an input graph,
and the set of available colors.

Next, we construct a program, P_gcl, encoding the constraints of the problem. It involves
predicates vtx, edge and color, specified in the data part, and defines a new relation clrd that
models assignments of colors to vertices.

C1: clrd(X, C) → vtx(X)

C2: clrd(X, C) → color(C)

C3: vtx(X) → clrd(X, _)

C4: clrd(X, C) ∧ clrd(X, D) → (C = D)

C5: edge(X, Y) ∧ clrd(X, C) ∧ clrd(Y, C) → ⊥.

The condition (C1) states that the only objects that get colored are vertices. Indeed, by
the definition, a model of the theory (D_gcl(G, k), P_gcl) contains an atom vtx(x) if and only if x
is a vertex of an input graph. Thus, if clrd(v, c) belongs to a model of (D_gcl(G, k), P_gcl), then
vtx(v) belongs to the model and, so, v is a vertex. Similarly, (C2) states that the only objects
assigned by the predicate clrd to a vertex are colors. (C3) states that each vertex X gets
assigned at least one color. (C4) enforces that each vertex is assigned at most one color. (C5)
ensures that two vertices connected by an edge are assigned different colors. These clauses
correctly capture the constraints of the coloring problem.

Proposition 4.1 Let G = (V, E) be an undirected graph and let k be a positive integer.
An assignment f: V → {1, ..., k} is a k-coloring of G if and only if M = D_gcl(G, k) ∪
{clrd(v, f(v)): v ∈ V} is a model of the data-program pair (D_gcl(G, k), P_gcl).

Proof: (⇒) Let us assume that f: V → {1, ..., k} is a k-coloring of G. We will show that
M = D_gcl(G, k) ∪ {clrd(v, f(v)): v ∈ V} is a model of (D_gcl(G, k), P_gcl), that is, it is a
model of gr(cl(D_gcl(G, k)) ∪ P_gcl). From the definition of M it follows that M satisfies all
rules in cl(D_gcl(G, k)). We will now show that M satisfies all rules in gr(cl(D_gcl(G, k)) ∪ P_gcl)
that are obtained by grounding rules in P_gcl.

First, we consider an arbitrary ground instance of rule (C1), say, clrd(x, c) → vtx(x),
where x and c are two constants of the language. It is clear from the definition of M that if
clrd(x, c) ∈ M, then x ∈ V and, consequently, vtx(x) ∈ M. Thus all ground instances of (C1)
are satisfied by M.

Next, we consider a ground instance r of rule (C3), say,

r = vtx(x) → ⋁{clrd(x, c): c ∈ V ∪ {1, ..., k}},

where x ∈ V ∪ {1, ..., k}. If vtx(x) ∈ M, then x ∈ V. Since f(x) ∈ {1, ..., k}, and
clrd(x, f(x)) ∈ M, it follows that r is satisfied by M. All other rules can be dealt with in
a similar way.

(⇐) We will now assume that M is a model of (D_gcl(G, k), P_gcl). By the definition of a model,
we have (1) vtx(x) ∈ M if and only if x ∈ V, (2) edge(x, y) ∈ M if and only if {x, y} ∈ E, and (3)
color(i) ∈ M if and only if i ∈ {1, ..., k}.

Now, we observe that since M satisfies all ground instances of (C1), if clrd(x, c) ∈ M, then
x ∈ V. Similarly, since M satisfies all ground instances of (C3), for every x ∈ V there is at
least one constant c such that clrd(x, c) ∈ M. On the other hand, since M satisfies all ground
instances of (C2), for each such constant c, c ∈ {1, ..., k}. Next, we have that M satisfies all
ground instances of (C4). Consequently, for every x ∈ V there is exactly one c ∈ {1, ..., k}
such that clrd(x, c) ∈ M. Let us denote by f the function that assigns to each x ∈ V the
unique c ∈ {1, ..., k} such that clrd(x, c) ∈ M. It follows that
M = D_gcl(G, k) ∪ {clrd(v, f(v)): v ∈ V}.
Moreover, as M satisfies all ground instances of (C5), f is a k-coloring of G. □

Let us note that the proof of Proposition 4.1 implies, in fact, that the correspondence 
between models and colorings is a bijection. 
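
For a very small instance, the correspondence between models and colorings can be checked by
brute force. The Python sketch below is only an illustration: it hard-codes the ground
instances of (C1)-(C5) instead of grounding a program, it represents an atom clrd(x, c) as the
pair (x, c), and the function names are hypothetical. It enumerates all subsets of clrd atoms
over the Herbrand universe V ∪ {1, ..., k}, which is feasible only for tiny inputs.

```python
from itertools import product

def satisfies_coloring_rules(clrd, V, E, k):
    """Check the ground instances of (C1)-(C5) against a set of (x, c) pairs."""
    colors = set(range(1, k + 1))
    if any(x not in V for x, _ in clrd):                      # (C1)
        return False
    if any(c not in colors for _, c in clrd):                 # (C2)
        return False
    if any(all(x != v for x, _ in clrd) for v in V):          # (C3)
        return False
    if any(x == y and c != d
           for (x, c), (y, d) in product(clrd, repeat=2)):    # (C4)
        return False
    return not any((x, c) in clrd and (y, c) in clrd
                   for x, y in E for c in colors)             # (C5)

def coloring_models(V, E, k):
    """Enumerate every candidate extension of clrd and keep those that are models."""
    universe = list(V) + list(range(1, k + 1))
    atoms = list(product(universe, repeat=2))
    found = []
    for mask in range(1 << len(atoms)):
        clrd = {atom for i, atom in enumerate(atoms) if (mask >> i) & 1}
        if satisfies_coloring_rules(clrd, V, E, k):
            found.append(dict(clrd))   # the function f read off the model
    return found

models = coloring_models(["a", "b"], [("a", "b")], 2)
print(models)
```

For a single edge and two colors the sketch finds exactly two models, one for each proper
2-coloring of the edge.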

Next, we will describe a data-program pair encoding an instance of the vertex-cover problem
for graphs. Let G = (V, E) be a graph. A set W ⊆ V is a vertex cover of G if for every edge
{x, y} ∈ E, x or y (or both) are in W. The vertex-cover problem is defined as follows: given
a graph G = (V, E) and an integer k, k ≤ |V|, decide whether G has a vertex cover with no
more than k vertices.

For the vertex-cover problem the input data is described by the following set of ground
atoms:

D_vc(G, k) = {vtx(v): v ∈ V} ∪ {edge(v, w): {v, w} ∈ E} ∪ {index(i): i = 1, ..., k}.

This set of atoms specifies the set of vertices and the set of edges of an input graph. It also 
provides a set of k indices which we will use to select a subset of no more than k vertices in 
the graph, a candidate for a vertex cover of cardinality at most k. 

The vertex cover problem itself is described by the program P_vc. It introduces a new
relation symbol vc. Intuitively, we use vc to represent the fact that a vertex has been selected
to a candidate set.

VC1: vc(I, X) → vtx(X)

VC2: vc(I, X) → index(I)

VC3: index(I) → vc(I, _)

VC4: vc(I, X) ∧ vc(I, Y) → X = Y

VC5: edge(X, Y) → vc(_, X) ∨ vc(_, Y).

(VC1) and (VC2) ensure that vc(i, x) is false if i is not an integer from the set {1, ..., k} or
if x is not a vertex (that is, if vc(i, x) is true, i ∈ {1, ..., k} and x ∈ V). The rules (VC3) and
(VC4) together impose the requirement that every index i has exactly one vertex assigned to
it. It follows that the set of ground atoms vc(i, x) that are true in a model of the data-program
pair (D_vc(G, k), P_vc) defines a subset of V with cardinality at most k. Finally, (VC5) ensures
that each edge has at least one end vertex assigned by vc to an index from {1, ..., k} (in
other words, that vertices assigned to indices 1, ..., k form a vertex cover). The correctness
of this encoding is formally established in the following result. Its proof is similar to that of
Proposition 4.1 and we omit it.

Proposition 4.2 Let G = (V, E) be an undirected graph and let k be a positive integer. If
W ⊆ V is a vertex cover of G and |W| ≤ k, then for every sequence w_1, ..., w_k that enumerates
all elements in W (possibly with repetitions), M = D_vc(G, k) ∪ {vc(i, w_i): i = 1, ..., k} is a
model of the data-program pair (D_vc(G, k), P_vc). Conversely, if M is a model of (D_vc(G, k), P_vc)
then the set W = {w ∈ V: vc(i, w) ∈ M, for some i = 1, ..., k} is a vertex cover of G with
|W| ≤ k.
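
The relationship between sequences and covers stated in Proposition 4.2 can be observed on a
small graph. The Python sketch below is an illustration only. It assumes that, by (VC1)-(VC4),
a model is determined by a sequence w_1, ..., w_k of vertices, and it checks (VC5) directly;
the function name vertex_cover_models is hypothetical.

```python
from itertools import product

def vertex_cover_models(V, E, k):
    """Return all sequences (w_1, ..., w_k) whose elements cover every edge (VC5)."""
    models = []
    for seq in product(V, repeat=k):
        chosen = set(seq)
        if all(x in chosen or y in chosen for x, y in E):
            models.append(seq)
    return models

# Path a - b - c with k = 2.
models = vertex_cover_models(["a", "b", "c"], [("a", "b"), ("b", "c")], 2)
covers = {frozenset(seq) for seq in models}
print(len(models), len(covers))
```

On this path graph the sketch finds seven sequences but only four distinct vertex covers,
because several sequences enumerate the same set.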

In this case, we do not have a one-to-one correspondence between models and vertex covers
of cardinality at most k. It is so because we represent sets by means of sequences.

Next, we will consider the Hamiltonian-cycle problem in a directed graph. To represent an
input graph G = (V, E) we use the following set of ground atoms:

D_hc(G) = {vtx(x): x ∈ V} ∪ {edge(x, y): (x, y) ∈ E} ∪ {index(i): i = 1, ..., |V|}.

The set of indices is introduced as part of input because we will represent a Hamiltonian cycle
by a bijective sequence of vertices such that every two consecutive vertices in the sequence, as
well as the last and the first, are connected with an edge. To represent such sequences we use a
relation symbol hc_perm. The program, P_hc, defining “Hamiltonian” sequences hc_perm(i, x)
looks as follows. In this example we assume that ⊕ denotes a predefined relation of addition
modulo n defined on the set of integers {1, ..., n} (thus, in particular, n ⊕ 1 = 1).

HC1: hc_perm(I, X) → index(I)

HC2: hc_perm(I, X) → vtx(X)

HC3: index(I) → hc_perm(I, _)

HC4: hc_perm(I, X) ∧ hc_perm(I, Y) → X = Y

HC5: hc_perm(I, X) ∧ hc_perm(J, X) → I = J

HC6: hc_perm(I, X) ∧ hc_perm(I ⊕ 1, Y) → edge(X, Y).

The first two rules ensure that if hc_perm(i, x) is true in a model of (D_hc, P_hc) then i is an
integer from the set {1, ..., |V|} and x ∈ V. The rules (HC3) - (HC5) together enforce the
constraint that hc_perm defines a permutation of vertices. Finally, the last rule imposes the
Hamiltonicity constraint that from every vertex in the sequence to the next one (and from the
last one to the first one, too) there is an edge in the graph. Formally, we have the following
result (the correspondence it establishes is not one to one as a Hamiltonian cycle can be
represented by |V| different permutations, each being a cyclic shift of another).

Proposition 4.3 Let G = (V, E) be a directed graph with n vertices. A permutation v_1, ..., v_n
of all vertices of V is a Hamiltonian cycle if and only if M = D_hc(G) ∪ {hc_perm(i, v_i): i =
1, ..., n} is a model of the data-program pair (D_hc(G), P_hc).
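
The role of the relation ⊕ in (HC6) can be made concrete with a small Python sketch. It is an
illustration under our own assumptions: since (HC1)-(HC5) force hc_perm to be a permutation,
the sketch enumerates permutations and checks only (HC6); the names succ_mod and
is_hamiltonian_sequence are hypothetical.

```python
from itertools import permutations

def succ_mod(i, n):
    """Compute i (+) 1 on {1, ..., n}; in particular n (+) 1 = 1."""
    return i % n + 1

def is_hamiltonian_sequence(seq, edges):
    """Check (HC6): consecutive vertices, and the last and the first, are joined by an edge."""
    n = len(seq)
    return all((seq[i - 1], seq[succ_mod(i, n) - 1]) in edges for i in range(1, n + 1))

# Directed cycle a -> b -> c -> a.
edges = {("a", "b"), ("b", "c"), ("c", "a")}
models = [p for p in permutations(["a", "b", "c"]) if is_hamiltonian_sequence(p, edges)]
print(models)
```

The single Hamiltonian cycle of this graph is represented by three permutations, one for each
cyclic shift.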

We will next consider the n-queens problem, that is, the problem of placing n queens on an
n × n chess board so that no queen attacks another. The representation of input data specifies
the set of row and column indices:

D_nq(n) = {index(i): i = 1, ..., n}.

The problem itself is described by the program P_nq. The predicate q describes a distribution
of queens on the board: q(x, y) is true precisely when there is a queen in the position (x, y).


nQ1: q(R, C) → index(R)

nQ2: q(R, C) → index(C)

nQ3: index(R) → q(R, _)

nQ4: q(R, C1) ∧ q(R, C2) → (C1 = C2)

nQ5: q(R1, C) ∧ q(R2, C) → (R1 = R2)

nQ6: q(R, C) ∧ q(R + I, C + I) → ⊥

nQ7: q(R, C) ∧ q(R + I, C − I) → ⊥


The first two rules ensure that if q(r, c) is true in a model of (D_nq, P_nq) then r and c are
integers from the set {1, ..., n}. The rules (nQ3) - (nQ5) together enforce the constraint that
each row and each column contains exactly one queen. Finally, the last two rules guarantee
that no two queens are placed on the same diagonal. As in the other cases, we can formally
state and prove the correctness of this encoding. The proof is again quite similar to that of
Proposition 4.1 and so we omit it.


Proposition 4.4 Let n be a positive integer. A set of points on an n × n board, {(r_i, c_i): i =
1, 2, ..., n}, is a solution to the n-queens problem if and only if the set M = D_nq(n) ∪
{q(r_i, c_i): i = 1, 2, ..., n} is a model of the data-program pair (D_nq(n), P_nq).


As in the case of the graph-coloring problem, the correspondence between models and valid 
arrangements of queens on the board is a bijection. 
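
The n-queens encoding can be explored with a short Python sketch. It is an illustration only,
not a model finder: it assumes that (nQ1)-(nQ5) restrict candidate models to permutations
assigning a column to each row, and it checks the diagonal rules (nQ6) and (nQ7) for every
positive offset I that stays on the board. The function name queens_models is hypothetical.

```python
from itertools import permutations

def queens_models(n):
    """Return the extensions of q, as sets of (row, column) pairs, that satisfy nQ1-nQ7."""
    solutions = []
    for cols in permutations(range(1, n + 1)):
        # cols[r - 1] is the column of the queen placed in row r.
        diagonals_free = all(
            cols[r + i - 1] not in (cols[r - 1] + i, cols[r - 1] - i)   # (nQ6), (nQ7)
            for r in range(1, n + 1)
            for i in range(1, n - r + 1)
        )
        if diagonals_free:
            solutions.append({(r, cols[r - 1]) for r in range(1, n + 1)})
    return solutions

print(len(queens_models(4)), len(queens_models(6)))
```

For n = 4 the sketch finds two arrangements and for n = 6 it finds four, each corresponding to
exactly one model.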

For the last example in this section, we consider computing the transitive closure of a finite
directed graph G = (V, E), where V is a set of vertices and E is a set of directed edges (we will
assume that G has no loops). We recall that the transitive closure of the graph G = (V, E) is
the directed graph (V, E') such that an edge (x, y) belongs to E' if and only if there is in G a
directed path from x to y of length at least 1.

We will now describe the representation of data instances and give a PS program solving
the transitive closure problem. The data instance consists of a specification of an input graph
(V, E) and of a collection of integers, {1, 2, ..., |V|} that will allow us to count edges in the
paths. Thus, we set

D_tc(G) = {vtx(v): v ∈ V} ∪ {edge(v, w): (v, w) ∈ E} ∪ {index(i): 1 ≤ i ≤ |V|}.

Next, we construct a program, P_tc, encoding the constraints of the problem. Our encoding
uses an auxiliary 4-ary relation symbol path. The intended meaning of path(X, Y, Z, I) is that
it is true precisely when there is a directed path from X to Y such that Z is the immediate
predecessor of Y on the path and the path length is at most I. In P_tc we define the relation
path and use it to specify the relation tc that represents the transitive closure of the input
graph.

TC1: path(X, Y, Z, I) → vtx(X)

TC2: path(X, Y, Z, I) → vtx(Y)

TC3: path(X, Y, Z, I) → vtx(Z)

TC4: path(X, Y, Z, I) → index(I)

TC5: tc(X, Y) → vtx(X)

TC6: tc(X, Y) → vtx(Y)

TC7: path(X, Y, X, 1) → edge(X, Y)

TC8: edge(X, Y) → path(X, Y, X, 1)

TC9: path(X, Y, Z, 1) → X = Z

TC10: path(X, Y, Z, I + 1) → path(X, Z, _, I)

TC11: path(X, Y, Z, I + 1) → edge(Z, Y)

TC12: path(X, Z, W, I) ∧ edge(Z, Y) → path(X, Y, Z, I + 1)

TC13: tc(X, Y) → path(X, Y, _, _)

TC14: path(X, Y, Z, I) → tc(X, Y).

The first four rules enforce that if an atom path(x, y, z, i) is in a model of the data-program
pair (D_tc(G), P_tc) then x, y and z are vertices (vtx(x), vtx(y) and vtx(z) hold) and i is an index
(index(i) holds). The effect of the next two rules is similar but they concern the relation symbol
tc. The rules (TC7) - (TC9) enforce conditions that atoms path(x, y, z, 1) must satisfy to be
in a model. The rules (TC10) - (TC12) enforce recursive conditions that atoms path(x, y, z, i),
i ≥ 2, must satisfy in order to be in the model. Finally, the rules (TC13) - (TC14) define the
relation symbol tc in terms of the relation path.

The following result can now be proved by an easy induction. 

Proposition 4.5 Let G be a directed graph. The data-program pair (D_tc(G), P_tc) has a unique
model that consists of (1) all atoms in D_tc(G), (2) all atoms path(x, y, z, i) such that there is
a path in G from x to y of length i and with z being the last but one vertex on this path, and
(3) all atoms tc(x, y) such that there is a directed path of positive length from x to y in G.
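
The unique model described in Proposition 4.5 can be computed directly for small graphs. The
Python sketch below is an illustration under our own assumptions: it builds the path atoms
level by level, following the way (TC7)-(TC12) tie atoms with index I + 1 to atoms with index I,
and then reads off tc as in (TC13)-(TC14). The function name transitive_closure_model is
hypothetical, and atoms are represented as tuples.

```python
def transitive_closure_model(V, E):
    """Return the extensions of path and tc in the model of the transitive-closure pair."""
    paths = {(x, y, x, 1) for (x, y) in E}                 # (TC7)-(TC9)
    frontier = set(paths)
    for i in range(1, len(V)):                             # indices are bounded by |V|
        frontier = {(x, y, z, i + 1)                       # (TC10)-(TC12)
                    for (x, z, _w, _i) in frontier
                    for (z2, y) in E if z2 == z}
        paths |= frontier
    tc = {(x, y) for (x, y, _z, _i) in paths}              # (TC13)-(TC14)
    return paths, tc

paths, tc = transitive_closure_model(["a", "b", "c"], {("a", "b"), ("b", "c")})
print(sorted(paths), sorted(tc))
```

For the graph a → b → c, the only path atom with index 2 is path(a, c, b, 2), and tc contains
the pair (a, c) in addition to the two edges.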


We have chosen to discuss in detail the question of the transitive closure since it is well 
known that this property is not definable in first-order logic [AV91]. We can define it in our
logic PS because our notion of definability is different: data-program pairs define concepts as 
special Herbrand models. A more detailed discussion of these issues follows in the next section. 


5 Expressive power of the logic PS 


Our discussion in the previous sections demonstrated the use of the logic PS as a tool to 
represent computational problems. In this section, we will study the expressive power of the 
logic PS, that is, we will identify a class of computational problems that can be represented 
by means of finite PS programs. 

We first recall some database terminology [Ull88]. Let Dom be a fixed infinite set (for
instance, the set of all natural numbers). A relational schema over a domain Dom is a nonempty
sequence R = (r_1, ..., r_k) of relation symbols. Each relation symbol r_i comes with integer arity
a_i > 0. An instance of a relational schema R is a nonempty and finite set of ground atoms, each
of the form r_i(u_1, ..., u_{a_i}), where 1 ≤ i ≤ k and u_1, ..., u_{a_i} ∈ Dom. By ℐ(R) we denote the
set of all instances of a relational schema R. Since Dom is fixed, from now on we will not
explicitly mention it. We also emphasize that, unlike in standard presentations, we require
that instances of a relational schema be nonempty.

Relational schemas provide a framework for a precise definition of a class of computational
problems known as search problems. Let R and S be two disjoint relational schemas. A search
problem (over relational schemas R and S) is a recursive relation Π ⊆ ℐ(R) × ℐ(S). The set
ℐ(R) is the set of instances of Π. Given an instance I ∈ ℐ(R), the set {J ∈ ℐ(S) : (I, J) ∈ Π}
is the set of solutions to Π for the instance I.

Search problems abound. It is clear that the graph problems and the n-queens problem 
considered earlier in the paper are examples of search problems. More generally, all constraint 
satisfaction problems over discrete domains, including such basic AI problems as planning, 
scheduling and product configuration, can be cast as search problems. 

A search language (or language, for short) is a set L of expressions and a function μ such
that for every expression e ∈ L, μ(e) is a search problem. We call μ the interpretation function
for L. By the expressive power of a language L we mean the class of search problems defined
by expressions from L: {μ(e): e ∈ L}.

We note that the concept of a search problem extends that of a database query [Var82],
which is defined as a partial recursive function from ℐ(R) to ℐ(S). Consequently, fragments of
search languages consisting of those expressions that define partial functions are, in particular,
database query languages. In fact, one can regard a search problem as a second-order query, that
is, a mapping from the set of instances of some relational schema R into the power set of the set
of instances of another (disjoint) relational schema S. Pushing the analogy further, a search
language can be viewed as a second-order database query language: an expression in such
a language defines, given an instance of a relational schema R, a collection of instances of a
relational schema S rather than a single instance.

We will show that the logic PS gives rise to a search language and establish its expressive
power. An expression is a triple (P, R, S), where P is a PS program, and R and S are disjoint
nonempty sets of relation symbols in P. We will show that (P, R, S) can be viewed as a
specification of a search problem over relational schemas R and S. Namely, let D ∈ ℐ(R). For
every set M ⊆ HB(D ∪ P), by M[S] we denote the set of all those atoms in M that are built
by means of relation symbols from S. We define the interpretation function μ as follows:

μ(P, R, S) = {(D, F): D ∈ ℐ(R), and F = M[S], where M ∈ Mod(D, P)}.

It is clear that μ(P, R, S) ⊆ ℐ(R) × ℐ(S). Consequently, the set of PS expressions together
with the function μ is a search language.

In a similar way we can view as a search language the language of DATALOG¬ (logic
programming without function symbols) with the semantics of Herbrand models, supported
models [Cla78, Apt90] or stable models [GL88]. Since the expressive power of DATALOG¬
with the supported-model semantics will play a role in our considerations, we will recall relevant
notions and results.*

Let ℒ be a language of predicate logic. A DATALOG¬ clause is an expression r of the form

r = p(X) ← q_1(X_1), ..., q_m(X_m), ¬q_{m+1}(X_{m+1}), ..., ¬q_{m+n}(X_{m+n}),

where p, q_1, ..., q_{m+n} are relation symbols and X, X_1, ..., X_{m+n} are tuples of constant and
variable symbols with arities matching the arities of the corresponding relation symbols. We
call the atom p(X) the head of the clause r and denote it by h(r). If a clause has empty body,
we represent it by its head (thus, atoms can be regarded as clauses). For a clause r we also set

B(r) = q_1(X_1) ∧ ... ∧ q_m(X_m) ∧ ¬q_{m+1}(X_{m+1}) ∧ ... ∧ ¬q_{m+n}(X_{m+n}).

A DATALOG¬ program is a collection of DATALOG¬ clauses. Let P be a DATALOG¬
program. As usual, we call relation symbols that appear in the heads of clauses in P intentional.
We refer to all other relation symbols in P as extensional. We denote the sets of intentional
and extensional relation symbols of a DATALOG¬ program P by I(P) and E(P), respectively.
Next, for a relation symbol p that appears in P, we denote by Def(p) the set of all clauses in
P whose head is of the form p(t), for some tuple t of constant and variable symbols. In other
words, Def(p) consists of all clauses that define p.

In the paper we restrict our attention to DATALOG¬ programs of special form, called I/O
programs, providing a clear separation of data facts (ground atoms representing data) from
clauses (definitions of intentional relation symbols). To this end, we define first a class of pure
programs. We say that a DATALOG¬ program P is pure if

1. for every relation symbol p ∈ I(P), all clauses in Def(p) have the same head of the form
p(X), where X is a tuple of distinct variables

2. P contains no occurrences of constant symbols

3. E(P) ≠ ∅.

* All concepts related to DATALOG¬ that we mention here can be defined in a more general setting
of logic programming languages that include function symbols. For an in-depth discussion of logic
programming, we refer the reader to [Apt90].

Pure programs are, in particular, in the so-called normal form as they satisfy condition (1)
[Apt90]. An I/O program is a DATALOG¬ program of the form D ∪ P, where P is a pure
program and D ∈ ℐ(E(P)) (that is, D is a nonempty and finite set of ground atoms built of
relation symbols in E(P)). To simplify the discussion, we define supported models for I/O
programs only. It does not cause any loss of generality. Indeed, one can show that for every
logic program containing at least one constant symbol there is an I/O program with the same
intentional relation symbols and such that supported models of both programs, when restricted
to intentional ground atoms, coincide (that is, under the semantics of supported models, both
programs define the same relations).

Let P be a pure program and let D ∈ ℐ(E(P)). For a predicate p from I(P), we define its
(Clark's) completion cc(p) as

cc(p) = p(X) ⇔ ⋁{∃Y_r B(r): r ∈ Def(p)},

where X is a tuple of variables and Y_r is the tuple of distinct variables occurring in the body
of r but not in the head of r (we exploit the normal form of P here) [Cla78]. We define the
(Clark's) completion of P, CC(P), by setting

CC(P) = {cc(p): p ∈ I(P)}.

Finally, we define a set of ground atoms M ⊆ HB(D ∪ P) to be a supported model of an I/O
program D ∪ P if it is a Herbrand model of cl(D) ∪ CC(P), where cl(D) is defined as in Section
4. We denote by Sup(D ∪ P) the collection of all supported models of D ∪ P.
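
At the ground level, the definition of a supported model can be tested directly. The Python
sketch below is an illustration under our own assumptions: a ground clause is a triple
(head, positive body atoms, negated body atoms), atoms are strings, and the function names are
hypothetical. A set M is accepted when every intentional atom belongs to M exactly when the
body of some ground clause defining it is true in M, which is what the ground completion
requires; the extensional atoms are fixed by cl(D).

```python
from itertools import combinations

def is_supported_model(M, ground_rules, intentional):
    """Check the ground completion: an intentional atom is in M iff some clause for it fires."""
    for atom in intentional:
        fires = any(head == atom and set(pos) <= M and not (set(neg) & M)
                    for head, pos, neg in ground_rules)
        if (atom in M) != fires:
            return False
    return True

def supported_models(data, ground_rules, intentional):
    """Enumerate candidate sets of intentional atoms on top of the data atoms."""
    atoms = sorted(intentional)
    result = []
    for size in range(len(atoms) + 1):
        for chosen in combinations(atoms, size):
            M = set(data) | set(chosen)   # cl(D) fixes the extensional atoms
            if is_supported_model(M, ground_rules, intentional):
                result.append(M)
    return result

# Ground instances of: in(X) <- node(X), not out(X) and out(X) <- node(X), not in(X).
rules = [("in_a", ["node_a"], ["out_a"]), ("out_a", ["node_a"], ["in_a"])]
models = supported_models({"node_a"}, rules, {"in_a", "out_a"})
print(models)
```

This two-clause program has two supported models over the data {node_a}: one containing in_a
and one containing out_a.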

Let P be a pure program and let S ⊆ I(P). We define

ν(P, S) = {(D, F): D ∈ ℐ(E(P)), and F = M[S], where F ≠ ∅ and M ∈ Sup(D ∪ P)}.

Since E(P), I(P) and S can be regarded as relational schemas, ν(P, S) is a search problem.
Thus, the set of expressions (P, S), where P is a pure program and S is a subset of I(P),
together with the function ν form a search language.

The expressive power of this language is known. A search problem Π over relational schemas
R and S is in the class NP-search if there is a nondeterministic Turing Machine TM such that

1. TM runs in polynomial time

2. for every instance I of the schema R (input instance of Π), the set of strings left on the
tape when accepting computations for I terminate is precisely the set {J ∈ ℐ(S): (I, J) ∈
Π}, that is, the set of solutions to Π for the input I.

The class NP-search is precisely the class of search problems captured by finite DATALOG¬
programs with the supported-model semantics.

Theorem 5.1 ([MR01]) For every finite pure program P and every S ⊆ I(P), ν(P, S) is a
search problem in the class NP-search. Conversely, for every problem Π in the class NP-search
there is a pure program P and a set S ⊆ I(P) such that ν(P, S) = Π.


We will now show that the expressive powers of PS and of DATALOG¬ with supported
model semantics are the same. Namely, we will prove the following result.

Theorem 5.2 For every finite pure program P and every set S ⊆ I(P), there is a finite
PS program P' such that E(P) ∪ I(P) are among the relation symbols appearing in P' and
ν(P, S) = μ(P', E(P), S). Conversely, for every finite PS program P' and every nonempty
and disjoint sets R and S of relation symbols appearing in P', there is a finite pure program
P such that R = E(P), S ⊆ I(P) and μ(P', R, S) = ν(P, S).

Proof: Let P be a pure program. We will consider the completion CC(P) of P and construct
its equivalent representation in terms of PS rules (we recall that PS rules are just special
formulas from the language of predicate logic).

We build this representation of CC(P) as follows. Let p be a predicate symbol in I(P).
Let us assume that p(X), where X is a tuple of distinct variables, is the common head of all
clauses in Def(p). Let us consider a clause r ∈ Def(p), say

r = p(X) ← q_1(X_1), ..., q_m(X_m), ¬q_{m+1}(X_{m+1}), ..., ¬q_n(X_n),

and let Y_r be a tuple of distinct variables that appear in the body of r but not in its head. We
introduce a new predicate symbol d_r, of the arity |X| + |Y_r| and define the following PS rules:

ψ_i(r) = d_r(X, Y_r) → q_i(X_i), i = 1, ..., m
ψ_i(r) = d_r(X, Y_r) ∧ q_i(X_i) → ⊥, i = m + 1, ..., n
ψ_0(r) = q_1(X_1) ∧ ... ∧ q_m(X_m) → d_r(X, Y_r) ∨ q_{m+1}(X_{m+1}) ∨ ... ∨ q_n(X_n).

We define Ψ(r) = {ψ_0(r), ψ_1(r), ..., ψ_n(r)}. It is clear that Ψ(r) entails (in the first-order
logic) the universal sentence d_r(X, Y_r) ⇔ B(r) (intuitively, Ψ(r) specifies d_r(X, Y_r) so that it
can be regarded as an abbreviation for B(r)).

We will now use atoms d_r(X, Y_r) to define PS rules that form an equivalent representation
to the formula cc(p). Let us recall that

cc(p) = p(X) ⇔ ⋁{∃Y_r B(r): r ∈ Def(p)}.

Thus, we define the following PS rules:

cc'_r(p) = d_r(X, Y_r) → p(X), r ∈ Def(p)
cc'(p) = p(X) → ⋁{∃Y_r d_r(X, Y_r): r ∈ Def(p)}.

It is clear (by first-order logic tautologies) that

Φ(p) = ⋃{Ψ(r): r ∈ Def(p)} ∪ {cc'_r(p): r ∈ Def(p)} ∪ {cc'(p)}

and cc(p) have the same first-order models (modulo new relation symbols d_r).

Let us define P' = ⋃{Φ(p): p ∈ I(P)}. Clearly, P' is a PS program and every relation
symbol in E(P) ∪ I(P) appears in P'. Moreover, by the comment made above, for every instance D
of the schema E(P), cl(D) ∪ CC(P) and cl(D) ∪ P' have the same models and, in particular, the
same Herbrand models (again modulo new relation symbols). Thus, ν(P, S) = μ(P', E(P), S).

We will now prove the second part of the assertion. Let P' be a PS program and let R
and S be nonempty and disjoint sets of relation symbols appearing in P'. By Theorem 5.1, it
is enough to show that the search problem μ(P', R, S) belongs to the class NP-search. This
is, however, straightforward. A nondeterministic Turing machine M (as defined in [GJ79]) for
solving μ(P', R, S) can be described as follows:

1. Given an instance D ∈ ℐ(R), M grounds (in a deterministic way) the data-program pair
(D, P'). Since P' is fixed, the task can be accomplished in polynomial time with respect to
the size of D (measured as the total number of symbols in D).

2. M generates in a nondeterministic fashion (using its guessing module) a subset of the
Herbrand base. This task involves the number of guesses that is not greater than
|HB(D ∪ P')|, again a polynomial in the size of D.

3. Next, M checks (deterministically) that the subset that was guessed is a model of the
ground theory. This task can be accomplished in time that is polynomial in the size
of the grounding of the data-program pair (D, P') which, as we already pointed out, is
polynomial in the size of D.

4. If the subset that was guessed is not a model, M moves to halting state NO. Otherwise,
M rewrites the contents of the tape so that only these ground atoms of the guessed
model that are built of relation symbols in S are left, and moves to halting state YES.

It is clear that tape contents for accepting computations are precisely projections of models of
the data-program pair (D, P') onto S. That is, M solves μ(P', R, S) nondeterministically in
polynomial time. It follows that μ(P', R, S) is in the class NP-search. □
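
The guess-and-check procedure above can be mimicked deterministically by enumerating all
guesses, at the cost of exponential time. The Python sketch below is an illustration under our
own assumptions: a ground rule is a pair (body, head) of lists of atoms read as the implication
from the conjunction of the body to the disjunction of the head, atoms are tuples whose first
component is the relation symbol, and the function names are hypothetical.

```python
from itertools import combinations

def satisfies(M, rule):
    """A ground rule holds in M if some body atom is false or some head atom is true."""
    body, head = rule
    return not set(body) <= M or bool(set(head) & M)

def solve(ground_rules, herbrand_base, S):
    """Enumerate every guess (step 2), check it (step 3) and project it onto S (step 4)."""
    atoms = sorted(herbrand_base)
    solutions = set()
    for size in range(len(atoms) + 1):
        for guess in combinations(atoms, size):
            M = set(guess)
            if all(satisfies(M, rule) for rule in ground_rules):
                solutions.add(frozenset(a for a in M if a[0] in S))
    return solutions

# Ground theory: r(1) holds; r(1) implies s(1) or t(1); s(1) and t(1) exclude each other.
rules = [([], [("r", 1)]),
         ([("r", 1)], [("s", 1), ("t", 1)]),
         ([("s", 1), ("t", 1)], [])]
solutions = solve(rules, {("r", 1), ("s", 1), ("t", 1)}, {"s"})
print(solutions)
```

The theory has two models, {r(1), s(1)} and {r(1), t(1)}, so the projection onto the relation
symbol s yields the two sets {s(1)} and ∅.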


Corollary 5.3 A search problem Π is in the class NP-search if and only if there is a finite
PS program P and nonempty disjoint sets R and S of relation symbols appearing in P such
that Π = μ(P, R, S).


Decision problems can be viewed as special search problems. Thus, in particular, every
decision problem in the class NP can be expressed by means of a finite PS program (and two
nonempty disjoint sets of relation symbols appearing in it). This observation is a counterpart
to a result by Schlipf concerning DATALOG¬ [Sch95].


Corollary 5.4 A decision problem Π is in the class NP if and only if there is a finite PS
program P and nonempty disjoint sets R and S of relation symbols appearing in P such that
Π = μ(P, R, S).


6 Extensions of the logic PS 

From the programming point of view, the logic PS provides a limited repertoire of modeling
means: constraints must be represented as rules (essentially, standard clauses of predicate
logic). We will now present ways to enhance the effectiveness of logic PS as a programming
formalism. Namely, we will introduce extensions to the basic formalism of the logic PS to
provide direct support for representations of some common “higher-level” constraints. We denote
this extended logic PS by PS+.

6.1 Adding cardinality atoms

When considering the PS theories developed for the n-queens and vertex-cover problems one 
observes that these theories could be simplified if the language of the logic PS contained direct 
means to capture constraints such as: “exactly one element is selected” or “at most k elements 
are selected”. 

We already noted in the introduction that extensions of the language of DATALOG¬ with
explicit constructs to model such constraints and the corresponding modifications in the
algorithms to compute stable models resulted in significant performance improvements. These
gains can be attributed to the fact that programs in the extended language are usually much
more concise, their ground versions use fewer variables and have smaller sizes. Thus, the search
space of candidate models is also smaller.

It is natural to expect that similar gains are also possible in the case of our formalism.
With this motivation in mind, we extend the language of the logic PS by cardinality atoms.
We first consider a propositional language specified by a set of atoms At. By a propositional
cardinality atom (propositional c-atom, for short), we mean any expression of the form
m{p_1, ..., p_k}n (one of m and n, but not both, may be missing), where m and n are
nonnegative integers and p_1, ..., p_k are atoms from At. The notion of a rule generalizes in an
obvious way to the case when propositional c-atoms are present in the language. Namely, a
c-rule is an expression of the form

C = A_1 ∧ ... ∧ A_s → B_1 ∨ ... ∨ B_t,

where all A_i and B_i are (propositional) atoms or c-atoms.

Let M ⊆ At be a set of atoms. We say that M satisfies a c-atom m{p_1, ..., p_k}n if

m ≤ |M ∩ {p_1, ..., p_k}| ≤ n.

If m is missing, we only require that |M ∩ {p_1, ..., p_k}| ≤ n. Similarly, when n is missing, we
only require that m ≤ |M ∩ {p_1, ..., p_k}|. A set of atoms M satisfies a c-rule C if M satisfies
at least one atom B_j or does not satisfy at least one atom A_i.

For example, if At = {a, b, c, d}, then the expression

a → 2{a, c, d} ∨ d

is a c-rule. The set M = {a, c} is its model while M' = {a, b} is not.
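
The satisfaction conditions for propositional c-atoms and c-rules translate directly into
code. The Python sketch below is an illustration under our own assumptions: a plain atom is a
string, a c-atom is a triple (lower bound, list of atoms, upper bound) in which a missing bound
is represented by None, and the function names are hypothetical.

```python
def holds(M, item):
    """Decide whether M satisfies a plain atom (a string) or a c-atom (lower, atoms, upper)."""
    if isinstance(item, str):
        return item in M
    lower, atoms, upper = item
    count = len(M & set(atoms))
    return (lower is None or lower <= count) and (upper is None or count <= upper)

def satisfies_c_rule(M, body, head):
    """M satisfies a c-rule if some body item fails or some head item holds."""
    return any(not holds(M, a) for a in body) or any(holds(M, b) for b in head)

# The c-rule from the example: a -> 2{a, c, d} v d.
body, head = ["a"], [(2, ["a", "c", "d"], None), "d"]
print(satisfies_c_rule({"a", "c"}, body, head), satisfies_c_rule({"a", "b"}, body, head))
```

The set {a, c} contains two of the atoms a, c, d and so satisfies the c-atom in the head,
whereas {a, b} contains only one of them and does not contain d.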

To generalize the idea of a cardinality atom to the language of predicate calculus, we need
a syntax that will facilitate concise representations of sets. To this end, we will adapt the
standard set-theoretic notation, where

{p(x): x ∈ X and q(x)}

denotes the set of all atoms of the form p(x) for which x ∈ X and q(x) holds. For instance, in
the language used for modeling the n-queens problem an expression

{q(R, C): index(C) and R ≤ C}

will be interpreted as a template for defining sets. For every ground instantiation r of its
“free” variable R, it gives rise to the set (n is a constant defined in the data file of the n-queens
data-program pair)

{q(r, c): c = r, r + 1, ..., n}.

Formally, we define a cardinality atom or c-atom, for short, to be any expression

l{S_1; S_2; ...; S_k}u,

where l and u are terms (constants or variables) and S_1, S_2, ..., S_k are set definitions.
Intuitively, the meaning of a c-atom l{S_1; S_2; ...; S_k}u is that at least l and no more than u
of the atoms specified by set definitions S_1, ..., S_k are true. We will now make this intuition
precise. Our definitions are similar to those proposed in [SN02] in the context of SLP.

A set definition is an expression of the form p(t) : d_1(s_1) ∧ ... ∧ d_m(s_m), where p is a program
relation symbol, d_i, 1 ≤ i ≤ m, are data or predefined relation symbols, and t, s_i, 1 ≤ i ≤ m,
are tuples of terms. We note that it is possible that m = 0. We also note that this concept
is defined only in the context of data-program pairs as in that case there is a clear distinction
between data and program predicates. A variable appearing in a set definition as an argument
of one of data relation symbols is bound. Other variables appearing in this set definition are
free.

Let $S = p(t) : d_1(s_1) \wedge \ldots \wedge d_m(s_m)$ be a set definition appearing in a data-program pair
T. By our assumption, T contains at least one constant. For every ground substitution $\vartheta$
whose domain contains all free variables in S and does not contain any bound variables from
S, by $S\vartheta$ we denote the set of atoms defined as follows: $S\vartheta$ is the set of all atoms of the form
$p(t\vartheta\vartheta')$, where $\vartheta'$ is a ground substitution with the domain consisting of all bound variables in
S such that for every $i$, $1 \le i \le m$, $d_i(s_i\vartheta\vartheta')$ holds (we recall that data and predefined relation
symbols are fully specified by a data-program pair and this latter condition can be verified
efficiently). We also note that if m = 0, all variables appearing in t are free and $S\vartheta = \{p(t\vartheta)\}$.

Let us now consider a c-atom $A = l\{S_1; \ldots; S_k\}u$ appearing in a theory or a data-program
pair T. Without loss of generality, we will assume that sets of bound and free variables
appearing in c-atoms in T are disjoint. Let $\vartheta$ be a ground substitution whose domain does not
contain any bound variables appearing in $S_1, \ldots, S_k$. We define $A\vartheta$ as follows:

1. $A\vartheta = \bot$, if $l\vartheta$ or $u\vartheta$ are not integers appearing as constants in T

2. $A\vartheta = l\vartheta\{S_1\vartheta \cup \ldots \cup S_k\vartheta\}u\vartheta$, otherwise. In this case, $l\vartheta$ and $u\vartheta$ are integer
constants appearing in T, and $S_1\vartheta \cup \ldots \cup S_k\vartheta$ is a set of ground atoms.

We define gr(T) as in Section with the stipulation that c-atoms are grounded as specified 
above. The ground theory gr(T) consists of propositional c-rules. We define a set M of ground
atoms to be a model of T if it is a model of gr(T).

We will now illustrate these definitions with an example. Let (D, P) be a data-program
pair (in the language extended with c-atoms). Let us assume that

$$D = \{d_1(1), d_1(2), d_1(3), d_2(a), d_2(b)\}$$

and that

$$C = d_1(X) \rightarrow X\{p(X, Y) : d_1(Y) \wedge Y \ge X;\ q(Z) : d_2(Z)\}$$

is a rule in P. The variables Y and Z are bound in C, the variable X is free. Clearly, for every
ground substitution $\vartheta$ such that $X\vartheta = a$ or $b$, both the antecedent and the consequent of the
rule ground to $\bot$ and the rule grounds to $\bot \rightarrow \bot$.

In every other ground substitution, X is replaced with 1, 2 or 3. Thus, we get the following
three templates for propositional rules:

$$d_1(1) \rightarrow 1\{p(1, Y) : d_1(Y) \wedge Y \ge 1;\ q(Z) : d_2(Z)\}$$
$$d_1(2) \rightarrow 2\{p(2, Y) : d_1(Y) \wedge Y \ge 2;\ q(Z) : d_2(Z)\}$$
$$d_1(3) \rightarrow 3\{p(3, Y) : d_1(Y) \wedge Y \ge 3;\ q(Z) : d_2(Z)\}.$$

Set definitions in each of these rules specify sets of ground atoms and give rise to the following 
three ground instances of the rule C: 

$$d_1(1) \rightarrow 1\{p(1,1), p(1,2), p(1,3), q(a), q(b)\}$$
$$d_1(2) \rightarrow 2\{p(2,2), p(2,3), q(a), q(b)\}$$
$$d_1(3) \rightarrow 3\{p(3,3), q(a), q(b)\}.$$

From the last of these rules it follows that the atoms p(3,3), q(a) and q(b) must be true in
every model of (D, P).
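The grounding of the example rule can be reproduced with a few lines of code. This is only a sketch under stated assumptions: the helper names `ground_set` and `ground_consequent`, the dictionary `DATA`, and the encoding of ground atoms as tuples are hypothetical and do not reflect how psgrnd works internally. A lower bound that is not an integer constant makes the c-atom ground to falsity, which is represented here by `None`.

```python
from itertools import product

# Data part D of the example; CONSTANTS lists every constant of the pair.
DATA = {"d1": {(1,), (2,), (3,)}, "d2": {("a",), ("b",)}}
CONSTANTS = [1, 2, 3, "a", "b"]


def ground_set(head, bound_vars, condition, theta):
    """Ground one set definition under theta, which binds the free variables.

    Enumerates all bindings of the bound variables and keeps the head atoms
    whose data conditions hold.
    """
    atoms = set()
    for values in product(CONSTANTS, repeat=len(bound_vars)):
        env = {**theta, **dict(zip(bound_vars, values))}
        if condition(env):
            atoms.add(head(env))
    return atoms


def ground_consequent(x):
    """Ground the c-atom X{p(X,Y): d1(Y) and Y >= X; q(Z): d2(Z)} for X = x."""
    if not isinstance(x, int):
        return None  # the bound is not an integer constant: grounds to falsity
    theta = {"X": x}
    s1 = ground_set(lambda e: ("p", e["X"], e["Y"]), ["Y"],
                    lambda e: (e["Y"],) in DATA["d1"] and e["Y"] >= e["X"], theta)
    s2 = ground_set(lambda e: ("q", e["Z"]), ["Z"],
                    lambda e: (e["Z"],) in DATA["d2"], theta)
    return x, s1 | s2
```

For X = 3 the result is the bound 3 together with the three atoms p(3,3), q(a) and q(b), so all of them must be true whenever the c-atom holds.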

In the extended logic PS+ we can encode the vertex cover problem in a more straightforward
and more concise way. Namely, there is no need for integers to represent indices,
as sets are represented directly and not in terms of sequences. In this new representation
$(D'_{vc}(G,k), P'_{vc})$, $D'_{vc}(G,k)$ is given by

$$D'_{vc}(G, k) = \{\mathit{vtx}(v) : v \in V\} \cup \{\mathit{edge}(v, w) : \{v, w\} \in E\} \cup \{\mathit{size}(k)\},$$

and $P'_{vc}$ consists of the clauses:

VC'1: $\mathit{invc}(X) \rightarrow \mathit{vtx}(X)$
VC'2: $\mathit{size}(K) \rightarrow \{\mathit{invc}(X) : \mathit{vtx}(X)\}K$
VC'3: $\mathit{edge}(X, Y) \rightarrow \mathit{invc}(X) \vee \mathit{invc}(Y)$.

Atoms invc(x) that are true in a model of the PS+ theory $(D'_{vc}(G,k), P'_{vc})$ define a set of vertices
that is a candidate for a vertex cover. (VC'2) guarantees that no more than k vertices are
included. (VC'3) enforces the vertex-cover constraint.
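To make the reading of the three rules concrete, the sketch below checks them directly against a candidate set of vertices and enumerates all models by brute force. It is an illustration only: the function names are hypothetical, and the exhaustive enumeration is not the search method used by any of the solvers discussed in the paper.

```python
from itertools import combinations


def is_vc_model(vertices, edges, k, invc):
    """Check the three rules for a candidate set `invc` of chosen vertices."""
    vc1 = invc <= vertices                                  # VC'1: invc(X) -> vtx(X)
    vc2 = len(invc & vertices) <= k                         # VC'2: at most k atoms invc(x)
    vc3 = all(v in invc or w in invc for v, w in edges)     # VC'3: every edge is covered
    return vc1 and vc2 and vc3


def vertex_covers(vertices, edges, k):
    """Enumerate all models (as vertex sets) by exhaustive search."""
    found = []
    for size in range(k + 1):
        for chosen in combinations(sorted(vertices), size):
            if is_vc_model(vertices, edges, k, set(chosen)):
                found.append(set(chosen))
    return found
```

For the path 1 - 2 - 3 and k = 1, the only model selects vertex 2.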

Cardinality atoms also yield alternative encodings to the graph-coloring and n-queens problems.
In both cases, we use the same representation of input data and modify the program
component only. In the case of the graph-coloring problem, a single rule, (C'3), directly stating
that every vertex is assigned exactly one color, replaces two old rules (C3) and (C4).

C'1: $\mathit{clrd}(X, C) \rightarrow \mathit{vtx}(X)$
C'2: $\mathit{clrd}(X, C) \rightarrow \mathit{color}(C)$
C'3: $\mathit{vtx}(X) \rightarrow 1\{\mathit{clrd}(X, C) : \mathit{color}(C)\}1$
C'4: $\mathit{edge}(X, Y) \wedge \mathit{clrd}(X, C) \wedge \mathit{clrd}(Y, C) \rightarrow \bot$.

In the case of the n-queens problem the change is similar. The rules (nQ3) and (nQ4) are 
replaced with a single rule (nQ'3) and the rules (nQ5) and (nQ6) with a single rule (nQ'4). 


nQ'1: $q(R, C) \rightarrow \mathit{index}(R)$
nQ'2: $q(R, C) \rightarrow \mathit{index}(C)$
nQ'3: $\mathit{index}(R) \rightarrow 1\{q(R, C) : \mathit{index}(C)\}1$
nQ'4: $\mathit{index}(C) \rightarrow 1\{q(R, C) : \mathit{index}(R)\}1$
nQ'5: $\mathit{index}(R) \rightarrow \{q(R + I - 1, I) : \mathit{index}(I)\}1$
nQ'6: $\mathit{index}(C) \rightarrow \{q(I, C + I - 1) : \mathit{index}(I)\}1$
nQ'7: $\mathit{index}(R) \rightarrow \{q(R - I + 1, I) : \mathit{index}(I)\}1$
nQ'8: $\mathit{index}(C) \rightarrow \{q(n - I + 1, C + I - 1) : \mathit{index}(I)\}1$.

The rule (nQ'5) enforces the condition that the main ascending diagonal and all ascending 
diagonals above it contain at most one queen. The rule (nQ'6) enforces the same condition for 
the ascending main diagonal and all ascending diagonals below it. Finally, the rules (nQ'7) and 
(nQ'8) enforce the same condition for descending diagonals. In the original encoding we used 
only two clauses to represent these conditions. We could use them here again. However, the 
four clauses that we propose here, and that are possible thanks to the availability of c-atoms, 
result in significantly smaller ground theories. We address this issue in detail in Section 

6.2 Adding closure computation to logic PS+ 

In Section ^ we presented programs capturing the concepts of reachability in graphs and 
of transitive closure of binary relations. These representations are less elegant and, more 
importantly, less concise than representations possible in SLP. For instance, the transitive 
closure of a binary relation r can be computed by the following DATALOG¬ program:

TC'1: $tc(X, Y) \leftarrow r(X, Y)$
TC'2: $tc(X, Y) \leftarrow r(X, Z), tc(Z, Y)$.

This encoding capitalizes on the minimality that is inherent in the stable-model semantics 
(in this case, the program, being a Horn program, has a unique least model). Moreover, the 
grounding of this program has size linear in the cardinality of the relation r. 
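The least model of this Horn program can be obtained by iterating the two rules until nothing new is derived. The sketch below shows such a naive fixpoint computation; the function name `transitive_closure` and the representation of the relation as a set of pairs are assumptions made for illustration, and no attempt is made at efficiency.

```python
def transitive_closure(r):
    """Least model of TC'1 and TC'2 for a binary relation given as a set of pairs."""
    tc = set(r)                      # TC'1: every pair of r is in tc
    changed = True
    while changed:                   # repeat TC'2 until a fixpoint is reached
        changed = False
        for (x, z) in r:
            for (z2, y) in list(tc):
                if z == z2 and (x, y) not in tc:
                    tc.add((x, y))   # TC'2: r(x, z) and tc(z, y) give tc(x, y)
                    changed = True
    return tc
```

For the chain r = {(1,2), (2,3), (3,4)}, the result adds exactly the pairs (1,3), (2,4) and (1,4).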

Constraints involving reachability, transitive closure and other related concepts are quite 
common. In the problem of existence of a Hamiltonian cycle in a directed graph, we may first 
constrain candidate sets of edges to those that span collections of disjoint cycles covering all 
vertices in the graph (for instance, by imposing the restriction that in each vertex exactly one 
edge from the candidate set starts and exactly one edge from the candidate set ends). Clearly, 
such a candidate set is a Hamiltonian cycle if and only if it is connected. This requirement can 
be enforced by the constraint that all graph vertices be reachable, by edges in the candidate 
set, from some (arbitrary) vertex in the graph. 

With this motivation in mind, we will now introduce yet another extension of the basic 
logic providing, in particular, means to express constraints involving reachability, connectivity, 
transitive closure and similar related concepts in a way they are used in SLP. To this end, we 
extend both the syntax and the semantics of the logic PS+. 

As it is standard, by a Horn rule we mean a PS rule (a rule without cardinality atoms) 
whose consequent is a single regular atom (that is, not a c-atom). Horn rules play a key role
in this extension of the logic. The idea is to split the program component in a data-program 
pair into three parts. Intuitively, the first of them will describe initial constraints on the 
space of candidate solutions. The second, consisting of Horn rules, will “close” each candidate 
generated by the first part. The third component will provide additional constraints that have 
to be satisfied by the closure. 

Formally, by an extended program we mean a triple (G, H, V) such that

1. G and V are collections of (arbitrary) PS+ rules, called generating and verifying rules,
respectively, and H is a collection of Horn rules

2. No relation symbol appearing in the consequent of a rule in H appears in rules from G.

An extended data-program pair is a pair (D, P), where D is a set of ground atoms (data) and P
is an extended program. When listing an extended program, we use the following convention.
We write Horn rules as in logic programming, starting at the left with the head, followed by
the (reversed) arrow $\leftarrow$ as the implication connective and, finally, followed by the conjunction
of the atoms of the body. There is no need to explicitly distinguish between rules in G and V 
as the partition is implicitly defined by H. Namely, non-Horn rules involving relation symbols 
appearing in the consequents of Horn rules form the set V. All other non-Horn rules form the 
set G. 

Let (D, P) be an extended data-program pair, where P = (G, H, V). A set M of ground atoms
from the Herbrand base of (D, G) is a model of (D, P) if

1. M is a model of (D, G)

2. the closure of M under H, that is, the least Herbrand model of the Horn theory $M \cup H$,
satisfies all ground instances of rules in V.

The first condition enforces that models of (D, P) satisfy all constraints specified by G. Thus,
G can be regarded as a generator of the search space, as there are still additional constraints
to be satisfied. The second condition eliminates all those models generated by G whose closure
under H violates some of the constraints given by V. In other words, H computes the closure 
and V verifies whether the closure has all of the desired properties. 

As an illustration of the way this extension of logic PS+ can be used we will provide a 
formal representation of the Hamiltonian-cycle problem, capturing intuitions described above. 
Let $G = (V, E)$ be a directed graph and let $v_0$ be an arbitrary vertex in V. To represent this
data we set

$$D'_{hc}(G, v_0) = \{\mathit{vtx}(v) : v \in V\} \cup \{\mathit{edge}(v, w) : (v, w) \in E\} \cup \{\mathit{start}(v_0)\}.$$

Formally speaking, for the Hamiltonian-cycle problem, there is no need to include $v_0$ in the
data set. We do it, as our encoding involves the notion of reachability, for which some arbitrary
"starting" point is needed. The (extended) program part consists of the following six
rules.

HC'1: $\mathit{hc\_edge}(X, Y) \rightarrow \mathit{edge}(X, Y)$
HC'2: $1\{\mathit{hc\_edge}(Y, X) : \mathit{vtx}(Y)\}1$
HC'3: $1\{\mathit{hc\_edge}(X, Y) : \mathit{vtx}(Y)\}1$
HC'4: $\mathit{visit}(Y) \leftarrow \mathit{visit}(X) \wedge \mathit{hc\_edge}(X, Y)$
HC'5: $\mathit{visit}(Y) \leftarrow \mathit{start}(Y)$
HC'6: $\mathit{visit}(X)$.

We note the use of our notational convention. Clearly, the rules (HC'4) and (HC'5) form
the Horn part (it is indicated by the way they are written). It follows that the rules (HC'1)-
(HC'3) are generating and the rule (HC'6) is verifying. Intuitively, the rule (HC'1) guarantees
that if an atom hc_edge(x,y) is true, then (x,y) is an edge (in other words, only edges of
the graph can be chosen to form a Hamiltonian cycle). Rule (HC'2) captures the constraint
that for every vertex x there is exactly one selected edge that ends in x. Similarly, the rule
(HC'3) captures the constraint that for every vertex x there is exactly one selected edge that
starts in x. Thus, every model of the data-program pair consisting of $D'_{hc}(G, v_0)$ and the rules
(HC'1)-(HC'3) contains $D'_{hc}(G, v_0)$ and a set of atoms hc_edge(x,y) that describe a particular
selection of edges that span in G disjoint cycles covering all its vertices. Rules (HC'4) and
(HC'5) define the relation visit that describes all vertices in G reachable from $v_0$ by means of
selected edges. Finally, the last rule verifies that all vertices are reached, that is, that selected
edges form, in fact, a Hamiltonian cycle.
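The generate, close and verify reading of this encoding can be sketched as follows. The code is an illustration under assumptions: the names `closure_visit` and `is_hc_model` are hypothetical, a candidate is given directly as a set of selected edges, and the start rule is read as making the start vertex visited.

```python
def closure_visit(start, hc_edges):
    """Closure under the Horn rules: vertices reachable from start via selected edges."""
    visit = {start}                          # start vertex is visited
    frontier = [start]
    while frontier:
        x = frontier.pop()
        for (u, y) in hc_edges:
            if u == x and y not in visit:    # visit(Y) <- visit(X), hc_edge(X, Y)
                visit.add(y)
                frontier.append(y)
    return visit


def is_hc_model(vertices, edges, start, hc_edges):
    """Check the generating rules, compute the closure, then check the verifying rule."""
    if not hc_edges <= edges:                                    # only graph edges are selected
        return False
    for x in vertices:
        if sum(1 for (_, z) in hc_edges if z == x) != 1:         # exactly one edge ends in x
            return False
        if sum(1 for (y, _) in hc_edges if y == x) != 1:         # exactly one edge starts in x
            return False
    return vertices <= closure_visit(start, hc_edges)            # every vertex is visited
```

A selection consisting of two disjoint 2-cycles on four vertices passes the generating rules but fails the final check, because the closure computed from the start vertex covers only one of the cycles.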


6.3 Expressive power of extended logics 

We close this section with an observation on the expressive power of the logic PS+. Since it 
is a generalization of the logic PS, it can capture all problems that are in the class NP-search. 
On the other hand, the search problem of computing models of a data-program pair {D,P), 
where P is a fixed PS+ program, is an NP-search problem (a simple modification of the proof 
of the second assertion of Theorem 5^ demonstrates that). Thus, it follows that the expressive 
power of the logics PS+ does not extend beyond the class NP-search. In other words, the logic 
PS+ also captures the class NP-search. 


7 Computing with PS+ theories

In the preceding sections, we focused on the use of the logic PS+ as a language for encoding 
(programming) search problems and established its expressive power. In order to use the logic 
PS+ as a computational problem solving tool we need algorithmic methods for processing 
data-program pairs and finding their models. 

Let us recall that a set M of ground atoms is a model of a data-program pair (D, P) if
and only if M is a model of the theory $gr(cl(D) \cup P)$. Thus, to compute models one could
proceed in two steps: first, compute $gr(cl(D) \cup P)$ and then, find models of the ground theory.
We refer to these steps as grounding and solving, respectively. This two-step approach is used
successfully by all current implementations of SLP, including smodels and dlv. We will adhere
to it, as well.

It is easy to see that the data complexity of grounding is in the class P. That is, there is an 
algorithm that, for every data-program pair (D, P), computes $gr(cl(D) \cup P)$ and, assuming that
P is fixed, works in time that is polynomial in the size of D. For instance, a straightforward 
enumeration of all substitutions of appropriate arities (determined by the numbers of free 
variables in program rules) can be adapted to yield a polynomial-time algorithm for grounding. 

This straightforward approach can be improved. The size of grounding (although polynomial
in the size of the data part) is often very big. To address this potential problem, we
note that to compute the models it is not necessary to use $gr(cl(D) \cup P)$. Any propositional
theory that has the same models as $gr(cl(D) \cup P)$ can be used instead. In this context, let
us note that the truth values of all ground atoms appearing in $gr(cl(D) \cup P)$ that are built
of data relation symbols can be computed efficiently by testing whether they are present in 
D. Similarly, we can effectively evaluate truth values of all ground atoms built of predefined 
relation symbols by, depending on the relation, checking whether two constants are identical, 
different or, in the case of integer constants, whether one is the sum, product, etc. of two other 
integer constants. 

Thus, the theory $gr(cl(D) \cup P)$ can be simplified by taking into account the truth values
of ground atoms built of data and predefined relation symbols. Let A be such a ground atom.

1. If A appears in the consequent of a clause and is true, we eliminate this clause.

2. If A appears in the consequent of a clause and is false, we eliminate A from the
consequent of the clause.

3. If A appears in the body of a clause and is true, we eliminate A from the body.

4. If A appears in the body of a clause and is false, we eliminate this clause.

These simplifications may reveal other atoms with forced truth values and the process
continues, much in the spirit of unit propagation used in satisfiability solvers. For instance, if
we obtain a rule consisting of a single (regular) atom and this atom appears in the consequent
of the rule, the atom must be true. If, on the other hand, this single atom appears in the body
of the rule, it must be false. Furthermore, if a cardinality atom of the form $m\{p_1, \ldots, p_k\}n$
is forced to be true and the number of atoms $p_i$ that have been already assigned value true
is $n$, then all the unassigned atoms $p_i$ must be false. In addition, if the number of atoms $p_i$
that have been already assigned value false is $k - m$, then all unassigned atoms must be true.
Similar propagation rules exist for the case when a c-atom is forced to be false.
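The two propagation rules for a c-atom that is forced to be true can be written down directly. In the sketch below, which is illustrative and not taken from psgrnd or aspps, the first case fires when the number of true atoms has reached the upper bound, and the second when so many atoms are false that every remaining atom is needed to reach the lower bound. The function name and the dictionary-based assignment are assumptions.

```python
def propagate_true_catom(lower, upper, atoms, assignment):
    """Return the truth values forced on unassigned atoms of a c-atom known to be true.

    `assignment` maps already assigned atoms to True or False; a missing
    bound is given as None.
    """
    true_count = sum(1 for a in atoms if assignment.get(a) is True)
    false_count = sum(1 for a in atoms if assignment.get(a) is False)
    unassigned = [a for a in atoms if a not in assignment]
    if upper is not None and true_count == upper:
        # upper bound reached: no further atom may be true
        return {a: False for a in unassigned}
    if lower is not None and false_count == len(atoms) - lower:
        # every remaining atom is needed to reach the lower bound
        return {a: True for a in unassigned}
    return {}
```

For example, for 1{a, b, c}1 with a already true, the atoms b and c are forced to be false.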

We continue the process of simplifying the theory as long as new atoms with forced truth 
values are discovered. We call the theory that results when no more simplifications are possible 
the ground core of a data-program pair (D, P). We denote it by core(D, P).
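For theories without c-atoms, the simplification process can be sketched as a loop that applies the four elimination steps and records atoms forced by single-atom rules, stopping when nothing new is forced. This is a minimal illustration, assuming rules are pairs `(body, head)` of lists of atom names and truth values are kept in a dictionary; the name `ground_core` is hypothetical and the code is not the psgrnd implementation.

```python
def ground_core(rules, known):
    """Simplify rules (body -> head) using known truth values until a fixpoint.

    Returns the remaining rules and the extended dictionary of known values.
    A remaining rule with empty body and empty head signals inconsistency.
    """
    known = dict(known)
    forced = True
    while forced:
        forced = False
        remaining = []
        for body, head in rules:
            if any(known.get(a) is True for a in head):
                continue                                   # step 1: rule is satisfied
            if any(known.get(a) is False for a in body):
                continue                                   # step 4: rule is satisfied
            body = [a for a in body if a not in known]     # step 3: drop true body atoms
            head = [a for a in head if a not in known]     # step 2: drop false head atoms
            if not body and len(head) == 1:
                known[head[0]] = True                      # single atom in the consequent
                forced = True
            elif not head and len(body) == 1:
                known[body[0]] = False                     # single atom in the body
                forced = True
            else:
                remaining.append((body, head))
        rules = remaining
    return rules, known
```

Starting from the single data atom d with rules d -> p, p -> q v r and q -> (empty consequent), the loop forces p true, q false and r true and leaves no rules behind.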

We have the following straightforward result (as in the other cases before, we do not 
explicitly mention ground predefined atoms when specifying models). 

Proposition 7.1 Let (D, P) be a data-program pair. A set M of ground atoms is a model of
(D, P) if and only if $M = D \cup T \cup M'$, where T is the set of atoms that are forced to be true
and M' is a model of core(D, P).

Proposition 7.1 suggests that for the grounding step, it is enough to compute core(D, P)
rather than $gr(cl(D) \cup P)$. This is an important observation. The size of the theory core(D, P),
measured as the total number of symbol occurrences, is usually much smaller than
that of $gr(cl(D) \cup P)$. Following this general idea, we designed and implemented a program,
psgrnd, that, given a data-program pair (D, P), computes its ground equivalent core(D, P).

We will now focus on the second step, that is, searching for models of a propositional PS+ theory.
First, we will consider the class of theories that are obtained by grounding data-program pairs 
whose program component does not contain c-atoms. In this case, the ground core of a data- 
program pair is a collection of standard propositional clauses (written as implications). The 
program psgrnd provides an option that, in such case, produces the ground core of the input 
data-program pair in the DIMACS format. Consequently, most of the current implementations 
of propositional satisfiability (SAT) solvers can be used in the solving step to compute models. 
Thus, we can view the logic PS as a programming tool for modeling problems in terms of 
propositional constraints and regard psgrnd as a front-end facilitating the use of SAT solvers. 
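A ground core consisting of standard clauses written as implications translates to DIMACS CNF by negating body atoms and keeping head atoms positive. The sketch below illustrates that translation; the function name `to_dimacs` and the numbering of atoms in order of first occurrence are assumptions, and the actual output conventions of psgrnd may differ.

```python
def to_dimacs(rules):
    """Translate rules (body -> head) into DIMACS CNF text.

    Each rule b1 & ... & bk -> h1 v ... v hl becomes the clause
    -b1 ... -bk h1 ... hl, terminated by 0. Returns the text and the
    mapping from atom names to variable numbers.
    """
    index = {}

    def var(atom):
        # number atoms 1, 2, 3, ... in order of first occurrence
        return index.setdefault(atom, len(index) + 1)

    lines = []
    for body, head in rules:
        literals = [-var(a) for a in body] + [var(a) for a in head]
        lines.append(" ".join(str(lit) for lit in literals) + " 0")
    header = f"p cnf {len(index)} {len(rules)}"
    return "\n".join([header] + lines), index
```

The returned mapping is needed to translate a satisfying assignment reported by a SAT solver back into a set of ground atoms.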

If c-atoms and Horn rules are present in a program, the theory after grounding and simplification
is a propositional PS+ theory that contains, in general, (propositional) c-atoms and
propositional Horn rules. Thus, SAT solvers are not directly applicable. One approach in such
a case is to represent c-atoms and closure rules by means of equivalent (standard) propositional
theories. This is possible since, as we noted earlier, logics PS and PS+ have the same expressive
power.

We argue, however, that a more promising approach to compute models of data-program
pairs is to design solvers for propositional PS+ theories that are direct outcomes of the grounding
process and, in general, may contain c-atoms. The reason is that using high-level constraints
results in programs whose ground representations are often more concise than those obtained
from corresponding programs that do not involve such constraints. We will illustrate this point
using programs developed earlier in the paper.

We start with the vertex-cover problem. Let G be an input graph with n vertices and m
edges, and let k be an integer specifying the cardinality of a vertex cover. In the case of the
program consisting of rules (VC1) - (VC5), our grounding algorithm results in a propositional
theory with kn atoms of the form vc(i,x) and with $\Theta(kn^2)$ rules of total size (measured by
the number of atom occurrences) also $\Theta(kn^2)$. On the other hand, grounding of the program
consisting of rules (VC'1) - (VC'3) yields a theory with n atoms of the form invc(x) and with
$O(m)$ rules of total size $\Theta(n + m)$. Thus, this latter encoding involves fewer atoms (if $k \ge 2$)
and has the size that is asymptotically smaller.

Next, we will consider the Hamiltonian-cycle problem. Our first encoding (rules (HC1) -
(HC6)) grounds to a theory with $n^2$ atoms and with total size $O(n^2 + n(n^2 - m))$. Our second
encoding, involving Horn rules (rules (HC'1) - (HC'6)), grounds to a theory with $n^2 + n$ atoms
and the total size of $O(n^2)$. Thus, even though this theory uses slightly more atoms, it has
significantly smaller total size (except for the case of "almost complete" graphs, that is, graphs
with the number of "missing" edges equal to $o(n^2)$, one order of magnitude smaller).

For the original encoding of the n-queens program, psgrnd produces a propositional theory
of size $O(n^3)$. On the other hand, it is easy to see that grounding of the second encoding
(the one involving cardinality atoms) has size $O(n^2)$, a gain of an order of magnitude.

In the case of the encodings for the graph-coloring problems, we also obtain more concise 
theories by grounding programs designed with the use of c-atoms. Indeed, the rule (C'3) 
grounds to a smaller theory than rules (C3) and (C4). The improvement is, in general, by a 
constant factor and so, it is not asymptotically better. 

Since encodings involving c-atoms are usually smaller and define smaller search spaces, it is 
important to design solvers that can take direct advantage of these small representations. We 
developed a solver, aspps (short for “answer-set programming with propositional schemata”), 
that can directly handle c-atoms and closure rules. The aspps solver is an adaptation of the 
Davis-Putnam algorithm for computing models of propositional CNF theories. That is, it is a 
backtracking search algorithm whose two key components are unit propagation and branching. 
Unit propagation “propagates” through the theory truth values established so far. If there is a 
rule with all atoms in the antecedent assigned value true and all but one atom in the consequent 
assigned value false, then the remaining “unassigned” atom in the consequent must be true 
for the rule to hold. Similarly, if all atoms in the consequent of a rule are false and if all but 
one atom in the antecedent are true, the only “unassigned” atom in the antecedent must be 
false. In this way any partial assignment of truth values to atoms forces truth assignments on 
some additional atoms. When no more atoms can be forced, the second module, branching, 
selects a way to split search space into separate parts. When search in a part fails, the program 
backtracks and tries another. 
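The interplay of unit propagation and branching described above can be sketched for theories built of regular atoms only. The code below is an illustrative backtracking search, not the aspps implementation: it ignores c-atoms and closure rules, branches on the first unassigned atom rather than using any heuristic, and the names `unit_propagate` and `solve` are hypothetical.

```python
def unit_propagate(rules, assign):
    """Extend `assign` with forced values; return None if some rule is violated."""
    assign = dict(assign)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if (any(assign.get(a) is False for a in body)
                    or any(assign.get(a) is True for a in head)):
                continue                       # rule already satisfied
            free_body = [a for a in body if a not in assign]
            free_head = [a for a in head if a not in assign]
            if not free_body and not free_head:
                return None                    # body true, consequent false: conflict
            if not free_body and len(free_head) == 1:
                assign[free_head[0]] = True    # last consequent atom must be true
                changed = True
            elif not free_head and len(free_body) == 1:
                assign[free_body[0]] = False   # last antecedent atom must be false
                changed = True
    return assign


def solve(rules, atoms, assign=None):
    """Backtracking search: propagate, then branch on an unassigned atom."""
    assign = unit_propagate(rules, assign or {})
    if assign is None:
        return None
    free = [a for a in atoms if a not in assign]
    if not free:
        return assign
    for value in (True, False):
        result = solve(rules, atoms, {**assign, free[0]: value})
        if result is not None:
            return result
    return None
```

On the rules (a v b), (a -> false) and (b -> c), propagation alone determines the model in which a is false and b and c are true, so no branching is needed.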

A key difference between aspps and satisfiability algorithms is in how branching is implemented.
In satisfiability solvers, in order to branch, we pick an atom, say a, and split the
search space into two parts. In one of them we assume that the atom a is true. In the other
one we assume that a is false. Propositional PS+ theories may, in general, contain c-atoms
and aspps considers them too when selecting a way to branch.

To explain the method that aspps uses, let us observe that the unit propagation may, in 
particular, assign a truth value to a c-atom appearing in the theory. That constrains possible 
truth assignments to unassigned atoms that form the c-atom. 

For example, let us consider a propositional c-atom $C = 1\{a, b, c, d\}1$ that we know must
be true. Let us assume that d has already been assigned value false and that a, b and c have
not. There are exactly three ways in which atoms a, b and c can be assigned truth values
consistent with C being true:

a = t, b = f, c = f
a = f, b = t, c = f
a = f, b = f, c = t.

It follows that if a truth value of a c-atom C has been forced, we have an additional way to
split the search. Namely, we can consider in turn each truth assignment to unassigned atoms
appearing in C that is consistent with the truth value of C. In our example, if C is true, we
could split the search space into three subspaces by assuming first that a = t, b = f and c = f,
then that a = f, b = t and c = f and, finally, that a = f, b = f and c = t.
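The branches defined by a c-atom that is forced to be true can be enumerated by choosing which of its unassigned atoms become true. The following sketch assumes both bounds are present and uses a hypothetical function name, `catom_branches`; it illustrates the idea only and is not taken from the aspps sources.

```python
from itertools import combinations


def catom_branches(lower, upper, atoms, assign):
    """List all assignments to unassigned atoms consistent with the c-atom being true."""
    true_count = sum(1 for a in atoms if assign.get(a) is True)
    free = [a for a in atoms if a not in assign]
    branches = []
    for extra in range(len(free) + 1):
        # `extra` is the number of currently unassigned atoms set to true
        if lower <= true_count + extra <= upper:
            for chosen in combinations(free, extra):
                branches.append({a: (a in chosen) for a in free})
    return branches
```

For the c-atom 1{a, b, c, d}1 with d already false, the function returns the three branches listed in the example.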

The choice of the way to branch is of vital importance. To make this selection, the aspps 
program approximates the degree to which the atom is constrained. That is, aspps first assigns 
to each clause a weight based on its current length. The fewer atoms in a clause the more 
constraining it is and the greater its weight. The weight of a (regular) atom is defined as the 
sum of the weights of all clauses containing it. It is this number that aspps uses to estimate 
how much the atom is constrained. 

When looking for the way to branch, the aspps program considers all (regular) atoms that 
have not been assigned a truth value yet. It also considers some c-atoms. Let C be a c-atom 
that has been forced to be true by earlier choices. Let A be the set of atoms appearing in C 
that have not received a truth value yet. The atom C is considered as a candidate to define 
branching if the number of truth assignments to atoms in A that are consistent with the truth 
value of C (the number of branches C defines) is less than or equal to |A|. 

If there are c-atoms satisfying these conditions, aspps will select the one among them that
maximizes the sum of weights of unassigned atoms that appear in it. Otherwise, aspps will
branch on a regular atom with the maximum weight. If a propositional PS+ theory contains
Horn clauses, they play no role in the process of selecting the next atom for branching. They
do participate, though, in the unit propagation step.
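The selection procedure can be sketched as follows. The text does not give the exact weight formula, so the clause weight used here, $2^{-\text{length}}$, is purely an assumption chosen so that shorter clauses weigh more; the function names and the representation of a forced c-atom as a tuple `(lower, upper, true_count, free_atoms)` are likewise hypothetical.

```python
from math import comb


def atom_weights(clauses):
    """Weight of an atom: sum of the weights of the clauses that contain it.

    The clause weight 2 ** -length is an assumed formula, not the one used by aspps.
    """
    weights = {}
    for clause in clauses:
        w = 2.0 ** -len(clause)
        for atom in clause:
            weights[atom] = weights.get(atom, 0.0) + w
    return weights


def branch_count(lower, upper, true_count, free):
    """Number of assignments to the free atoms consistent with the c-atom being true."""
    return sum(comb(len(free), extra) for extra in range(len(free) + 1)
               if lower <= true_count + extra <= upper)


def choose_branch(clauses, catoms):
    """Pick a forced c-atom with few branches if one exists, else the heaviest atom.

    `clauses` lists the unassigned atoms of each clause; `catoms` holds tuples
    (lower, upper, true_count, free_atoms) for c-atoms forced to be true.
    """
    weights = atom_weights(clauses)
    candidates = [c for c in catoms if c[3] and branch_count(*c) <= len(c[3])]
    if candidates:
        best = max(candidates, key=lambda c: sum(weights.get(a, 0.0) for a in c[3]))
        return ("c-atom", best)
    return ("atom", max(weights, key=weights.get))
```

A c-atom such as 1{a, b, c}1 with no atom assigned defines three branches over three unassigned atoms and therefore qualifies, whereas 1{d, e, f}2 defines six branches and does not.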

The source code of the programs psgrnd and aspps, together with information on their
implementation details and use, is available at http://www.cs.uky.edu/ai/aspps/.

8 Experimental results 

Several data-program pairs that we presented in the paper (both with and without c-atoms and 
Horn rules) show that the logics PS and PS+ are effective as formalisms for modeling search 
problems. In this section we will demonstrate computational feasibility of these formalisms 
when combined either with our native solver aspps, specifically tailored to handle the syntax 
of the logic PS+ (c-atoms and Horn rules), or with off-the-shelf SAT solvers. 

We show that the performance of aspps is generally comparable with that of smodels and, in
the cases discussed here, even better. We chose smodels for the comparison since (1) smodels
accepts a syntax similar to that of aspps, (2) in the case of each problem considered here, there is
an smodels program essentially identical in its basic structure to the PS+ program presented
in the paper, and (3) smodels is at present one of the most advanced implementations of the
answer-set programming paradigm based on DATALOG¬ with the stable-model semantics.

Next, we show that our language of data-program pairs in the basic logic PS (without 
c-atoms and Horn rules), together with the program psgrnd, greatly simplifies the use of SAT 
solvers in computing solutions to search problems. 

Finally, we compare the performance of aspps and smodels with that of SAT solvers as 
engines for solving search problems. 

We stress that our experiments did not aim at demonstrating superiority of one solver over 
another. That would require a much more comprehensive and careful experimental study. Our 
objective was to demonstrate the feasibility of our approach. 

For our test cases we selected problems that we used as examples throughout the paper:
the n-queens, graph-coloring, vertex-cover, and Hamiltonian-cycle problems. We used
psgrnd/aspps and smodels for encodings in the logic PS+ (that is, encodings involving c-atoms
and Horn rules). We used the combinations psgrnd/satz and psgrnd/zchaff for programs
without c-atoms and Horn rules. In the experiments we used the following versions of these
programs: zchaff [MMZ+01b], satz215.2 [Li97], lparse-1.0.6 and smodels-2.26 (both at [NSS97]),
aspps.2001.10.18, psgrnd.2001.10.18 and, finally, psgrnd.2002.10.11 (as a front-end for
satisfiability solvers) [ET01a]. All our experiments were performed on a Pentium IV 1.7 GHz machine
running Linux.

In the case of vertex cover, for each n = 50, 60, 70, 80 we randomly generated 100 graphs
with n vertices and 2n edges. For each graph G, we computed the minimum size $k_G$ for which
the vertex cover can be found. We then tested aspps, smodels and satz on all the instances
$(G, k_G)$. The results (Table 1) represent the average execution times. Encodings we used
for testing aspps and smodels were based on the program (VC'1) - (VC'3). For satisfiability
solvers we used encodings based on the clauses (VC1) - (VC5) (as cardinality constraints cannot
be handled by satisfiability solvers).

As we observed, the size of the encoding (VC'1) - (VC'3) is, in general, asymptotically
smaller than that of (VC1) - (VC5). Thus, satisfiability solvers had to deal with much larger
theories (hundreds of thousands of clauses for graphs with 80 vertices as opposed to a few
hundred when c-atoms are used). Consequently, they did not perform well.

As concerns smodels versus aspps, in general aspps is somewhat (about three times) faster 
than smodels and the difference seems to grow with size. 


n          50       60       70       80
aspps      0.011    0.070    0.463    1.996
smodels    0.043    0.116    1.584    8.157

Table 1: Timing results (in seconds) for the vertex-cover problem. Average time for each set of 100
randomly generated graphs; satz and zchaff were halted after 10 minutes on a single instance.


For the n-queens problem, our solver performed exceptionally well. Cardinality constraints
again play a crucial role here. Aspps has to work with theories obtained by grounding the
program (nQ'1) - (nQ'8). In contrast, satz and zchaff have to work with much larger theories
obtained by grounding the program (nQ1) - (nQ7). When n = 70 this means the difference
between 416 and 562030 rules, respectively (in each case, the number of ground atoms is 4900).
Both solvers therefore required much more time than aspps. In fact, we stopped satz after 10
minutes on the 40-queen instance. Zchaff performed much better than satz and completed the
computation in the case of n = 70 in just under 10 minutes. Given the size of the theory it
has to work with, this is quite remarkable. One lesson, we believe, is that there is still a large
potential for improvements in the way aspps implements search. We also note that the aspps and
zchaff solvers both exhibited a somewhat irregular performance growth pattern as the number
of queens increased. Lastly, we note that smodels did not complete computation for n = 40 in
the 10 minutes we allocated, despite the fact that we used a more concise encoding, similar to
that processed by aspps. Table 2 summarizes these results.


# of queens    40        50        60        70        80
aspps          0.20      0.06      2.19      0.31      0.17
zchaff         158.92    157.74    283.87    558.10    ***

Table 2: Timing results (in seconds) for the n-queens problem; satz and smodels were halted after 10
minutes on the instance with n = 40.


In the case of the graph colorability problem, as we observed in the previous section, 
c-atoms do not give rise to significant gains in the size of the ground theory. Given the 
amount of research devoted to satisfiability solvers and still relatively few efforts to develop 
fast solvers for logics involving cardinality constraints, it is not surprising that satisfiability 
solvers outperform both aspps and smodels. Our results also show satz outperforming zchaff, 
which may be attributed to the fact that our test graphs were randomly generated and did 
not have any significant internal structure that could be capitalized on by zchaff. As concerns 
aspps and smodels, they show essentially the same performance. We summarize the relevant 
results in Table 3. The graphs for the 3-colorability problem were generated randomly with
vertex/edge ratios such that approximately 1/2 of the graphs were 3-colorable. For each value 
n = 100, 200 and 300, we generated a set of 1000 graphs. The values that we report are the 
average execution times. 


n            100      200      300
aspps      0.006    0.302   13.678
smodels    0.026    0.495   16.043
satz       0.013    0.077    1.416
zchaff     0.002    0.107    7.952


Table 3: Timing results (in seconds) for the graph 3-coloring problem. 
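
The text does not describe the generator used for the random test graphs. As a minimal sketch, assuming undirected graphs with a fixed number of edges drawn uniformly without repetition, instances of a given size could be produced as follows. The function name random_graph and the seed parameter are hypothetical, and the edge count that gives roughly half 3-colorable graphs has to be supplied by the user.

```python
import random
from itertools import combinations


def random_graph(num_vertices, num_edges, seed=None):
    """Return num_edges distinct undirected edges over vertices 0..num_vertices-1.

    Hypothetical helper: assumes a uniform choice among all vertex pairs,
    which may differ from the generator used in the experiments.
    """
    rng = random.Random(seed)
    # Enumerate every unordered pair (u, v) with u < v exactly once.
    all_pairs = list(combinations(range(num_vertices), 2))
    # Sample without replacement, so no edge is repeated.
    return sorted(rng.sample(all_pairs, num_edges))
```

Fixing the seed makes a batch of instances reproducible, which is convenient when the same graphs are given to several solvers and their running times are averaged.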


Our last experiment concerned the problem of computing Hamiltonian cycles. As in other 
cases when additional constraints (transitive closure computation, in this case) result in much 
smaller theories, both smodels and aspps outperform satz and zchaff. In addition, in this case, 
aspps significantly outperforms smodels. In the experiments, we considered graphs with 20, 40, 
60, 80 and 100 vertices with the number of edges chosen so that the likelihood of the existence 
of a Hamiltonian cycle is close to 0.5. For each set of parameters, we generated 1000 instances. 
The times given in Table 4 represent average execution times. 


V/E        20/75   40/180   60/300   80/425   100/550
aspps      0.000    0.001    0.002    0.003     0.005
smodels    0.006    0.034    0.117    0.255     0.456
satz       0.122      ***      ***      ***       ***
zchaff     0.144    2.277   33.405      ***       ***


Table 4: Timing results (in seconds) for determining the presence of a Hamiltonian cycle in a graph. 


It is clear from these results that our solver aspps is competitive with smodels and SAT 
solvers such as zchaff and satz as a processing back-end for problems encoded as data-program 
pairs in the logics PS and PS+. 

9 Conclusions 

Our work demonstrates that predicate logic and its extensions can support answer-set 
programming systems in the same way that stable logic programming does. To put it differently, we 
show that predicate logic can be an effective declarative programming formalism. 

In the paper we described the logic PS, which can be used to uniformly encode search problems. 
We proved that the expressive power of this logic is given by the class NP-search. Thus, it is 
the same as the expressive power of DATALOG¬, even though the logic PS is conceptually simpler: 
its semantics is essentially that of propositional logic. 

We demonstrated the use of our logic in modeling such search problems as graph coloring, 
vertex cover, n-queens, Hamiltonian cycle and transitive closure. 

We designed a program, psgrnd, that, given a data-program pair in the logic PS encoding 
a search problem, computes its equivalent propositional representation in the DIMACS format. 
In this way, it becomes possible to compute models of data-program pairs and, consequently, 
solve the corresponding search problems by means of standard off-the-shelf satisfiability solvers. 
We demonstrated that the approach is feasible and effective by applying satz and zchaff to 
propositional theories produced by psgrnd. 
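
To illustrate what a propositional representation in DIMACS format looks like, here is a minimal sketch that grounds the graph coloring problem by hand. It is not the output of psgrnd. The function coloring_to_dimacs and its variable numbering are hypothetical, and the clause set is the textbook propositional encoding, assumed here for illustration only.

```python
from itertools import combinations


def coloring_to_dimacs(num_vertices, edges, num_colors=3):
    """Return a DIMACS CNF string for graph coloring (hypothetical helper)."""

    def var(v, c):
        # Atom "vertex v has color c"; DIMACS variables are numbered from 1.
        return v * num_colors + c + 1

    clauses = []
    for v in range(num_vertices):
        # Every vertex receives at least one color.
        clauses.append([var(v, c) for c in range(num_colors)])
        # Every vertex receives at most one color.
        for c1, c2 in combinations(range(num_colors), 2):
            clauses.append([-var(v, c1), -var(v, c2)])
    for u, w in edges:
        # The endpoints of an edge never share a color.
        for c in range(num_colors):
            clauses.append([-var(u, c), -var(w, c)])

    # Header line "p cnf <variables> <clauses>"; each clause ends with 0.
    lines = [f"p cnf {num_vertices * num_colors} {len(clauses)}"]
    lines += [" ".join(map(str, clause)) + " 0" for clause in clauses]
    return "\n".join(lines) + "\n"
```

For a triangle, coloring_to_dimacs(3, [(0, 1), (1, 2), (0, 2)]) yields a file whose header is "p cnf 9 21". A file of this shape can be passed directly to a DIMACS-compatible satisfiability solver.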

We argued that the logic PS can benefit from extensions allowing explicit representations 
of some commonly used constraints such as cardinality constraints and transitive closure. 
Encodings of search problems that take advantage of these extensions are usually much smaller 
and, consequently, could result in smaller search spaces if solvers capable of taking advantage 
of direct representations of high-level constraints were available. We designed one such solver, 
aspps. Our experimental results are encouraging. Aspps is competitive with smodels, a 
state-of-the-art processing engine for DATALOG¬ programs extended by cardinality constraints and 
other constructs. In fact, in several cases, aspps outperforms smodels. 

The results of the paper show that programming front-ends for constraint satisfaction 
problems that support explicit coding of complex constraints facilitate modeling and result 
in concise representations. They also show that solvers such as aspps that take advantage of 
those concise encodings and process high-level constraints directly, without compiling them 
to simpler representations, exhibit very good computational performance. These two aspects 
are important. Satisfiability checkers often cannot effectively solve problems simply due to 
the fact that encodings they have to work with are large. For instance, for the vertex-cover 
problem for graphs with 80 vertices and 160 edges, aspps has to deal with theories that consist 
of only a few hundred rules. At the same time, pure propositional encodings of the same 
problem contain over one million clauses, a factor that undoubtedly is behind the much poorer 
performance of satz and zchaff on this problem. 
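
The growth in size can be made concrete with a small calculation. The sketch below assumes the naive (binomial) clausal encoding of a cardinality constraint "at most k of n atoms are true", which forbids every set of k + 1 atoms from being true together. This encoding is an assumption chosen for illustration; it is not necessarily the one behind the figure quoted above, and the function name is hypothetical.

```python
from math import comb


def naive_at_most_k_clauses(n, k):
    """Clause count of the binomial CNF encoding of 'at most k of n atoms'.

    Hypothetical helper: one clause per (k + 1)-element subset of the atoms,
    stating that at least one atom of the subset is false.
    """
    return comb(n, k + 1)
```

Under this assumption a single cardinality constraint over n atoms, which a solver such as aspps handles as one rule, expands into a number of clauses that grows combinatorially in n and k.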

Our work raises new questions. Further extensions of logic PS+ are possible. For instance, 
constraints that impose other conditions on set cardinalities than those considered here (such 
as the parity constraint) might be included. We will pursue this direction. Similarly, there is 
much room for improvement in the area of solvers for the propositional logic PS+, and aspps 
can certainly be improved. There is also a potential for developing local search techniques for 
the logic PS+. The task seems much easier than in the case of DATALOG¬ programs, where 
finding successful local search algorithms turned out to be hard [DS02]. 